Preface
The loss function is used in machine learning to represent the gap between the predicted value and the true value.
In general, most machine learning models are trained by having an optimizer reduce the loss function, which in turn moves the model's parameters toward their best values.
Since loss functions are so necessary, which loss functions exist?
The most commonly used loss functions are the mean square deviation function and the cross-entropy function.
Formulas
1 Mean square deviation function
The mean square deviation function, often called the mean squared error (MSE), is mainly used to evaluate regression models. The concept is relatively simple: it is the mean of the squared differences between the true values and the predicted values. The formula can be expressed as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(f(x_i) - y_i\right)^2$$

where f(x_i) is the predicted value, y_i is the true value, and n is the number of samples. On a two-dimensional plot, this function is the average of the squared vertical distances from each scatter point to the fitted curve, which is very intuitive.
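As a minimal sketch of the definition above, the mean of the squared differences can be computed in plain Python. The function name `mean_squared_error` and the sample values are assumptions made only for illustration; they are not part of the original article.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical sample values, chosen only for illustration.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]

# Squared errors are 0.25, 0.0 and 1.0, so the result is 1.25 / 3.
mse = mean_squared_error(y_true, y_pred)
print(mse)
```

A perfect fit gives a loss of 0, and the loss grows quadratically as a prediction moves away from its true value.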
2 Cross entropy function
Cross entropy is a concept from information theory, where it was originally used to estimate average code length. In the field of machine learning, it is often used as the loss function for classification problems.
How does the cross-entropy function work? Assume a classification problem in which the only possible answers are yes (1) or no (0). The predicted values are usually not absolute predictions such as 1 or 0, so it is common practice to treat predictions greater than 0.5 as 1 and predictions less than 0.5 as 0. For n samples with predicted value f(x_i) and true value y_i, the cross-entropy function is:

$$L = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log f(x_i) + (1 - y_i)\log\left(1 - f(x_i)\right)\right]$$

Now suppose there is a sample whose predicted value is close to 0 while its true value is 1. Then the first half of the cross-entropy function,

$$y_i \log f(x_i)$$

is much smaller than 0, and after the leading minus sign is applied it becomes much larger than 0, so the model's loss is very large. Reducing the cross-entropy function therefore pushes the predictions toward the true labels, which improves the prediction accuracy of the model.
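The effect described above can be checked with a small sketch for a single sample. The helper names `binary_cross_entropy` and `to_class`, the clamping constant `eps`, and the sample predictions 0.01 and 0.99 are all assumptions introduced for illustration.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy for one sample with a yes/no label (1 or 0)."""
    # Clamp the prediction away from 0 and 1 so log() stays finite.
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def to_class(y_pred, threshold=0.5):
    """Turn a predicted value into a hard 0/1 decision."""
    return 1 if y_pred > threshold else 0


# The true label is 1 in both cases; only the prediction differs.
loss_bad = binary_cross_entropy(1, 0.01)   # confident and wrong
loss_good = binary_cross_entropy(1, 0.99)  # confident and right
print(loss_bad, loss_good)
print(to_class(0.01), to_class(0.99))
```

The wrong prediction produces a loss several hundred times larger than the correct one, which is what drives the optimizer to correct it.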
Expressing the loss functions in TensorFlow
1 Mean square deviation function
loss = tf.reduce_mean(tf.square(logits - labels))
loss = tf.reduce_mean(tf.square(tf.subtract(logits, labels)))
loss = tf.losses.mean_squared_error(labels=labels, predictions=logits)
2 Cross entropy function
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
# Calculation: the input logits are first passed through the sigmoid function,
# and then their cross-entropy is calculated. The cross-entropy is computed
# in a way that keeps the result from overflowing.
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits)
# Calculation: the input logits are first passed through the softmax function,
# and then their cross-entropy is calculated. The cross-entropy is computed
# in a way that keeps the result from overflowing.
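To make the overflow remark concrete, here is a plain-Python sketch of how such losses can be computed stably. The sigmoid version uses the formulation given in the TensorFlow documentation, max(x, 0) - x * z + log(1 + exp(-|x|)); the softmax version uses the common log-sum-exp trick, which is an assumption about one reasonable approach and not a claim about TensorFlow's internals. The helper names and input values are hypothetical.

```python
import math


def stable_sigmoid_cross_entropy(label, logit):
    """max(x, 0) - x * z + log(1 + exp(-|x|)) for logit x and label z."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def stable_softmax_cross_entropy(labels, logits):
    """Cross entropy of softmax(logits) against labels via log-sum-exp."""
    shift = max(logits)  # subtracting the max keeps exp() from overflowing
    log_sum_exp = shift + math.log(sum(math.exp(x - shift) for x in logits))
    # log(softmax(x_k)) equals x_k - log_sum_exp
    return -sum(z * (x - log_sum_exp) for z, x in zip(labels, logits))


# A logit of -1000 would overflow exp(1000) in the naive 1 / (1 + exp(-x)).
big_loss = stable_sigmoid_cross_entropy(1.0, -1000.0)

# Three equal logits give a uniform softmax, so the loss is log(3).
uniform_loss = stable_softmax_cross_entropy([0.0, 1.0, 0.0], [2.0, 2.0, 2.0])
print(big_loss, uniform_loss)
```

Both helpers take raw logits, which mirrors why the TensorFlow functions should be given the network output before any sigmoid or softmax is applied.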
Examples
1 Mean square deviation function
This is an example of fitting a linear function. The three loss expressions are equivalent.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

x_data = np.random.rand(100).astype(np.float32)  # Get random x-values
y_data = x_data * 0.1 + 0.3                      # Calculate the corresponding y-values

# random_uniform returns a tensor of the given shape whose values are
# uniformly distributed between low and high.
Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
Biaxs = tf.Variable(tf.zeros([1]))  # Start the bias at 0

y = Weights * x_data + Biaxs

loss = tf.losses.mean_squared_error(y_data, y)  # Mean of the squared differences
# loss = tf.reduce_mean(tf.square(y_data - y))
# loss = tf.reduce_mean(tf.square(tf.subtract(y_data, y)))

optimizer = tf.train.GradientDescentOptimizer(0.6)  # Gradient descent
train = optimizer.minimize(loss)

# Replaces the deprecated tf.initialize_all_variables()
init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)
for i in range(200):
    sess.run(train)
    if i % 20 == 0:
        print(sess.run(Weights), sess.run(Biaxs))
The output result is:
[0.10045234] [0.29975605]
[0.10010818] [0.2999417]
[0.10002586] [0.29998606]
[0.10000619] [0.29999667]
[0.10000149] [0.2999992]
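What the optimizer does in this example can be sketched without TensorFlow: compute the gradient of the mean square deviation with respect to the weight and bias, then step against it. This is a simplified illustration and not the library's implementation. The starting point `w = b = 0.0` and the fixed random seed are assumptions; the target line and the step size 0.6 are taken from the example.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable (an assumption)

# Same target line as the example: y = 0.1 * x + 0.3
x_data = [random.random() for _ in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]

w, b = 0.0, 0.0      # hypothetical starting point
learning_rate = 0.6  # same step size as the example

for step in range(200):
    errors = [w * x + b - y for x, y in zip(x_data, y_data)]
    # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / len(x_data)
    grad_b = 2.0 * sum(errors) / len(x_data)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)
```

After 200 steps the weight and bias settle close to 0.1 and 0.3, the same behavior the printed TensorFlow output shows.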
2 Cross entropy function
This is an example of MNIST handwritten digit recognition. Either of the two loss functions can compute the cross entropy; they differ in which function (softmax or sigmoid) the logits are passed through before the loss is calculated.
import tensorflow as tf  # TensorFlow 1.x API
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

def add_layer(inputs, in_size, out_size, n_layer, activation_function=None):
    layer_name = 'layer%s' % n_layer
    with tf.name_scope(layer_name):
        with tf.name_scope("Weights"):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name="Weights")
            tf.summary.histogram(layer_name + "/weights", Weights)
        with tf.name_scope("biases"):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name="biases")
            tf.summary.histogram(layer_name + "/biases", biases)
        with tf.name_scope("Wx_plus_b"):
            Wx_plus_b = tf.matmul(inputs, Weights) + biases
            tf.summary.histogram(layer_name + "/Wx_plus_b", Wx_plus_b)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)
        tf.summary.histogram(layer_name + "/outputs", outputs)
        return outputs

def compute_accuracy(x_data, y_data):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: x_data})
    # Determine whether the predicted class equals the labeled class
    correct_prediction = tf.equal(tf.argmax(y_data, 1), tf.argmax(y_pre, 1))
    # Cast to float32 and take the average
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Execute; feed the data passed in rather than the last training batch
    result = sess.run(accuracy, feed_dict={xs: x_data, ys: y_data})
    return result

xs = tf.placeholder(tf.float32, [None, 784])
ys = tf.placeholder(tf.float32, [None, 10])

# The activation name was lost from the source text; tf.nn.tanh is an assumption.
layer1 = add_layer(xs, 784, 150, "layer1", activation_function=tf.nn.tanh)
# The loss function applies softmax or sigmoid itself,
# so no activation function is needed on the output layer.
prediction = add_layer(layer1, 150, 10, "layer2")

with tf.name_scope("loss"):
    # labels are the true labels, logits are the predicted values
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys, logits=prediction), name='loss')
    # loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=ys, logits=prediction), name='loss')
    tf.summary.scalar("loss", loss)

# The optimizer name was lost from the source text; AdamOptimizer is an assumption.
train = tf.train.AdamOptimizer(4e-3).minimize(loss)

# Replaces the deprecated tf.initialize_all_variables()
init = tf.global_variables_initializer()
merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(init)
    write = tf.summary.FileWriter("logs/", sess.graph)
    for i in range(5001):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train, feed_dict={xs: batch_xs, ys: batch_ys})
        if i % 1000 == 0:
            print("The recognition rate for training %d times is: %f." % ((i + 1), compute_accuracy(mnist.test.images, mnist.test.labels)))
            result = sess.run(merged, feed_dict={xs: batch_xs, ys: batch_ys})
            write.add_summary(result, i)
The output result is:
The recognition rate for training 1 times is: 0.103100.
The recognition rate for training 1001 times is: 0.900700.
The recognition rate for training 2001 times is: 0.928100.
The recognition rate for training 3001 times is: 0.938900.
The recognition rate for training 4001 times is: 0.945600.
The recognition rate for training 5001 times is: 0.952100.
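The recognition rate printed by compute_accuracy is the share of samples whose highest-scoring class matches the one-hot label. A minimal sketch of that comparison in plain Python follows; the helper names `argmax` and `accuracy` and the 3-class data are hypothetical and stand in for the 10-class MNIST labels and network outputs.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(labels, predictions):
    """Fraction of rows whose predicted class matches the one-hot label."""
    matches = [argmax(y) == argmax(p) for y, p in zip(labels, predictions)]
    return sum(matches) / len(matches)


# Hypothetical 3-class data: the second prediction picks the wrong class.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predictions = [
    [2.0, 0.1, -1.0],
    [0.3, 0.2, 1.5],
    [-0.5, 0.0, 0.9],
    [0.1, 3.0, 0.2],
]
acc = accuracy(labels, predictions)
print(acc)
```

Because only the position of the largest value matters, the raw logits can be compared directly without applying softmax first.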