Without further ado: a complete neural network is generally composed of three kinds of layers: the input layer, the hidden layer (there can be more than one), and the output layer. The network built in this article has a single hidden layer. In code, a neural network consists of three main parts: initialization, training, and prediction. First, let's initialize the neural network!
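To make that three-part structure concrete before we start, here is a minimal skeleton of the class we will build up section by section (the method bodies are filled in below):

```python
class NeuralNetwork:
    def __init__(self, input_nodes_num, hidden_nodes_num, output_nodes_num, lr):
        ...  # set layer sizes, learning rate, initial weights, activation function

    def train(self, inputs_list, targets_list):
        ...  # forward pass, compute errors, update the weights

    def query(self, inputs_list):
        ...  # forward pass only: return the network's prediction
```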
1. Initialization
What we are initializing includes:

- The number of neurons in each layer of the network (this depends on the actual problem's inputs and outputs, so we make it a configurable quantity).
- The weights on the connections between the layers, through which data passes from one layer to the next.
- The activation function (which mimics a neuron in nature, where a stimulus signal must reach a certain level before the neuron fires).
Code below:
```python
import numpy
import scipy.special

class NeuralNetwork:
    def __init__(self, input_nodes_num, hidden_nodes_num, output_nodes_num, lr):
        # Initialize the number of neurons in each layer; these can be modified directly
        self.input_nodes = input_nodes_num
        self.hidden_nodes = hidden_nodes_num
        self.output_nodes = output_nodes_num
        self.learning_rate = lr
        # Initialize the weights, drawn from a normal distribution with mean 0 and
        # standard deviation equal to the inverse square root of the number of nodes
        self.w_input_hidden = numpy.random.normal(0.0, pow(self.hidden_nodes, -0.5),
                                                  (self.hidden_nodes, self.input_nodes))
        self.w_hidden_output = numpy.random.normal(0.0, pow(self.output_nodes, -0.5),
                                                   (self.output_nodes, self.hidden_nodes))
        # The activation function is the sigmoid: smooth, and close to the behavior
        # pattern of a natural neuron. lambda defines an anonymous function.
        self.activation_function = lambda x: scipy.special.expit(x)
```
Let's explain some of the programming knowledge in the above code snippet. First, `__init__()` is the class constructor: it is called whenever an object of the class is created, so we put the neural network initialization code inside it.
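For example, the constructor runs automatically when a network object is created; the numbers here are placeholders, not values from the article:

```python
# __init__ is invoked here: 3 input nodes, 3 hidden nodes, 3 output nodes, learning rate 0.3
n = NeuralNetwork(3, 3, 3, 0.3)
```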
```python
self.w_input_hidden = numpy.random.normal(0.0, pow(self.hidden_nodes, -0.5),
                                          (self.hidden_nodes, self.input_nodes))
```

This line uses the numpy library's `random.normal()` function to initialize the weights for the data passing between the input and hidden layers. It randomly generates a normally distributed matrix of shape `self.hidden_nodes` × `self.input_nodes` (`hidden_nodes` and `input_nodes` denote the number of neurons in the hidden and input layers).
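As a quick standalone illustration of what this call produces (the layer sizes here are made up for the demo):

```python
import numpy

# A 3x5 weight matrix drawn from a normal distribution with mean 0.0 and
# standard deviation 3 ** -0.5, just like the constructor (3 hidden, 5 input nodes)
w = numpy.random.normal(0.0, pow(3, -0.5), (3, 5))
print(w.shape)  # (3, 5)
```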
```python
self.activation_function = lambda x: scipy.special.expit(x)
```

This line uses `lambda` to define an anonymous function and assigns it as the activation function. The function used is the sigmoid, a smooth curve that is relatively close to the way neurons in nature respond to stimulus signals.
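The sigmoid itself is just 1 / (1 + e^(-x)); scipy's `expit` computes exactly that, as this small check shows:

```python
import numpy
import scipy.special

x = numpy.array([-2.0, 0.0, 2.0])
print(scipy.special.expit(x))       # approximately [0.1192 0.5 0.8808]
print(1.0 / (1.0 + numpy.exp(-x)))  # the same values, computed by hand
```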
2. Prediction
In the normal order of things, training would come after initialization, but because training is more complex while prediction is simpler and easier to implement, we will write the prediction code first. Prediction requires us to process the input information: it is weighted, summed, and passed to the hidden-layer neurons; after the activation function and another weighted summation, the signal passes through the output-layer neurons to produce the final result. The code snippet is below:
```python
def query(self, inputs_list):
    # Transpose turns the row vector into a column vector, which separates each
    # set of data cleanly for the matrix dot products that follow
    inputs = numpy.array(inputs_list, ndmin=2).T
    # Hidden layer output: weighted sum passed through the sigmoid function
    hidden_inputs = numpy.dot(self.w_input_hidden, inputs)
    hidden_outputs = self.activation_function(hidden_inputs)
    # Another weighted sum and sigmoid to get the final output
    final_inputs = numpy.dot(self.w_hidden_output, hidden_outputs)
    final_outputs = self.activation_function(final_inputs)
    # Return the output column vector
    return final_outputs
```
There is not much to say about this code; it is relatively simple, so just follow the steps above. If anything is unclear, read the comments or leave a comment.
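As a quick sanity check, you can query an untrained network with arbitrary inputs (the numbers below are placeholders); the output is meaningless until training, but it confirms the shapes line up:

```python
n = NeuralNetwork(3, 3, 3, 0.3)   # assumes the class defined above
print(n.query([1.0, 0.5, -1.5]))  # a 3x1 column vector of sigmoid outputs
```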
3. Training
Training a neural network is more complex. It involves forward and backward propagation, the calculus chain rule, matrix operations, partial derivatives, and the gradient descent algorithm, all of which are machine learning basics, so I won't go into much detail here; in a few days I'll write a new post discussing them in depth. For now, here are the main tasks of the training snippet:
- Training starts the same way as prediction: we read in some inputs and predict the outputs. The difference is that in the training phase the data comes from the training set, where we know the correct outputs; in the prediction phase we know only the inputs, and the outputs must be produced by the model we have trained. So the training phase first reads in the inputs and predicts them using the current model.
- Update the weights between each layer based on the error between the training predictions and the labeled actual results.
Here is the code:
```python
def train(self, inputs_list, targets_list):
    # Turn the training inputs and target labels into column vectors
    inputs = numpy.array(inputs_list, ndmin=2).T
    targets = numpy.array(targets_list, ndmin=2).T
    # The input to the hidden layer is the dot product of the weights and the
    # training data; its output is the output of the activation function
    hidden_inputs = numpy.dot(self.w_input_hidden, inputs)
    hidden_outputs = self.activation_function(hidden_inputs)
    # The input to the output layer is the hidden layer's output; its output
    # is the final result
    final_inputs = numpy.dot(self.w_hidden_output, hidden_outputs)
    final_outputs = self.activation_function(final_inputs)
    # Error at the output layer
    output_errors = targets - final_outputs
    # The hidden-layer error is the dot product of the transposed weight
    # matrix and the output error
    hidden_errors = numpy.dot(self.w_hidden_output.T, output_errors)
    # Gradient-descent updates to the weights
    self.w_hidden_output += self.learning_rate * numpy.dot(
        output_errors * final_outputs * (1.0 - final_outputs),
        numpy.transpose(hidden_outputs))
    self.w_input_hidden += self.learning_rate * numpy.dot(
        hidden_errors * hidden_outputs * (1.0 - hidden_outputs),
        numpy.transpose(inputs))
```
The above code snippet might be confusing, or feel complicated, to students who are new to machine learning or deep learning, but it is just a combined application of the backpropagation algorithm, the chain rule, and partial derivatives. I'll write up what I've learned in another essay (probably imperfectly) for those who are interested.
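For reference, here is the update rule from train() written out in isolation: a minimal sketch with made-up toy values (2 output nodes, 3 hidden nodes), not anything from the article's dataset:

```python
import numpy

lr = 0.2
output_errors = numpy.array([[0.4], [-0.1]])         # targets - final_outputs
final_outputs = numpy.array([[0.6], [0.55]])         # sigmoid outputs of the output layer
hidden_outputs = numpy.array([[0.3], [0.7], [0.5]])  # sigmoid outputs of the hidden layer

# delta_W = lr * (error * output * (1 - output)) dot prev_output^T
gradient = numpy.dot(output_errors * final_outputs * (1.0 - final_outputs),
                     hidden_outputs.T)               # shape (2, 3), same as w_hidden_output
w_update = lr * gradient
print(w_update.shape)  # (2, 3)
```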
4. Testing
With the three-layer neural network constructed, I tested it using the MNIST training and test sets. The code is as follows:
```python
# Initialize the number of neurons in each layer: the number of input neurons
# depends on the input data, and the number of output neurons depends on the
# number of classes
input_nodes = 784
hidden_nodes = 100
output_nodes = 10
# Learning rate: the step size of each weight adjustment
learning_rate = 0.2

n = NeuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)

# Read the training set
training_data_file = open('data/mnist_train.csv', 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

for record in training_data_list:
    all_values = record.split(',')
    # Scale the pixel values from 0-255 into the range 0.01-1.00
    inputs = (numpy.asfarray(all_values[1:]) / 255.0 * 0.99) + 0.01
    # Target vector: 0.01 everywhere except 0.99 at the correct label
    targets = numpy.zeros(output_nodes) + 0.01
    targets[int(all_values[0])] = 0.99
    n.train(inputs, targets)

print('train successful!')

# Read the test set and measure accuracy
test_file = open('data/mnist_test.csv', 'r')
test_list = test_file.readlines()
test_file.close()

m = len(test_list)
j = 0.0
for record in test_list:
    test_values = record.split(',')
    results = n.query((numpy.asfarray(test_values[1:]) / 255.0 * 0.99) + 0.01)
    # Count the prediction as correct if the largest output is at the true label
    if results[int(test_values[0])] == max(results):
        j += 1

print("correct:" + str(j / m))
```
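One note on the accuracy check: `results[int(test_values[0])] == max(results)` counts a prediction as correct when the true label's output is the largest. An equivalent, arguably clearer, check uses numpy.argmax, as in this standalone sketch with a made-up output vector:

```python
import numpy

results = numpy.array([[0.02], [0.91], [0.05]])  # hypothetical query() output
label = 1                                        # true class for this record
# argmax returns the index of the largest output; compare it with the label
print(numpy.argmax(results) == label)            # True
```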
That wraps up this article on building and testing a simple three-layer neural network in Python. For more on implementing neural networks in Python, please look through my previous articles or browse the related articles below, and I hope you will continue to support me!