Python artificial intelligence with TensorFlow: a summary of common loss functions

Preamble

The loss function is used in machine learning to represent the gap between the predicted value and the true value.

In general, a machine learning model is trained by letting an optimizer reduce the loss function, which moves the model's parameters toward the values that give the best predictions.

Loss functions are clearly essential, so which ones are available?

The most commonly used loss functions are the mean squared error (MSE) function and the cross entropy function.

Formulas

1 Mean squared error function

The mean squared error function is mainly used to evaluate regression models. The concept is simple: it is the mean of the squares of the differences between the true values and the predicted values. The formula is:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(f(x_i) - y_i\right)^2$$

where f(x_i) is the predicted value, y_i is the true value, and n is the number of samples. On a two-dimensional plot, this function is the average of the squared distances, measured along the y-axis, from each scatter point to the fitted curve, which is very intuitive.
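
The mean of the squared differences described above can be checked by hand on a few numbers. The following is a minimal sketch in plain Python (no TensorFlow needed); the helper name `mean_squared_error` and the sample data are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: true values y_i and predictions f(x_i).
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)  # (0.25 + 0.0 + 1.0) / 3
```

Because the differences are squared, the one prediction that is off by 1.0 contributes four times as much as the one that is off by 0.5.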

2 Cross entropy function

Cross entropy is a concept from information theory, where it was originally used to estimate the average code length. In machine learning, it is often used as the loss function for classification problems.

How does the cross entropy function work? Assume a binary classification problem in which the answer is either yes (1) or no (0). The model's prediction is usually not exactly 1 or 0 but a value in between, so it is common practice to treat a prediction greater than 0.5 as 1 and a prediction less than 0.5 as 0.

For a single sample with true value y_i and predicted value f(x_i), the binary cross entropy is:

$$L_i = -\left[\,y_i \log f(x_i) + \left(1 - y_i\right)\log\left(1 - f(x_i)\right)\right]$$

Now suppose there is a sample whose predicted value is close to 0 while its true value is 1. Then the first half of the cross entropy function is:

$$y_i \log f(x_i)$$

Its result is much smaller than 0, and after the leading minus sign is applied it becomes much larger than 0, so the loss of the model is huge. Reducing the cross entropy therefore pushes the predictions toward the true labels, which can greatly improve the prediction accuracy of the model.
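
To see how strongly a confident wrong prediction is penalized, the binary cross entropy -[y log p + (1 - y) log(1 - p)] can be evaluated for a few predicted values p while the true value stays at 1. This is a small illustrative sketch in plain Python; the function name `binary_cross_entropy` and the clipping constant `eps` are assumptions made for the example, not part of TensorFlow.

```python
import math


def binary_cross_entropy(y_true, p, eps=1e-12):
    """Cross entropy for one sample: -[y*log(p) + (1-y)*log(1-p)]."""
    # Clip p away from 0 and 1 so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


# The true label is 1 in every case; only the prediction changes.
for p in (0.99, 0.6, 0.01):
    predicted_class = 1 if p > 0.5 else 0  # the 0.5 threshold rule
    print(p, predicted_class, round(binary_cross_entropy(1.0, p), 4))
```

The loss grows from about 0.01 for p = 0.99 to about 4.6 for p = 0.01, which is the "huge loss" the text refers to.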

Expressing the loss functions in TensorFlow

1 Mean squared error function

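# Three equivalent ways to write the mean squared error (TensorFlow 1.x API)
loss = tf.reduce_mean(tf.square(logits-labels))
loss = tf.reduce_mean(tf.square(tf.subtract(logits, labels)))
loss = tf.losses.mean_squared_error(labels,logits)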

2 Cross entropy function

loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=y,logits=logits)
# Calculation: the input logits are first passed through the sigmoid function, then their cross entropy with the labels is computed.
# The cross entropy is computed in a numerically stable form, so the result does not overflow.
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=logits)
# Calculation: the input logits are first passed through the softmax function, then their cross entropy with the labels is computed.
# The cross entropy is computed in a numerically stable form, so the result does not overflow.
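
The overflow protection mentioned in the comments can be illustrated in plain Python. For the sigmoid case, the TensorFlow documentation gives the stable form max(x, 0) - x*z + log(1 + exp(-|x|)) for logit x and label z. The softmax version below uses the standard log-sum-exp trick; it is a sketch of the idea and not necessarily how TensorFlow implements it internally. Both function names are made up for this example.

```python
import math


def stable_sigmoid_cross_entropy(label, logit):
    """Stable form of -[z*log(s(x)) + (1-z)*log(1-s(x))], s = sigmoid."""
    # exp() only ever receives a non-positive argument, so it cannot overflow.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def stable_softmax_cross_entropy(labels, logits):
    """Cross entropy between the labels and softmax(logits) via log-sum-exp."""
    # Subtracting the largest logit leaves softmax unchanged and keeps exp() small.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    log_softmax = [x - log_sum_exp for x in logits]
    return -sum(l * ls for l, ls in zip(labels, log_softmax))


# A naive sigmoid 1 / (1 + exp(-x)) would overflow for x = -1000.
big_loss = stable_sigmoid_cross_entropy(1.0, -1000.0)
print(big_loss)
print(stable_softmax_cross_entropy([0.0, 1.0, 0.0], [1000.0, 0.0, -1000.0]))
```

This is also why the raw logits, not already-activated outputs, should be passed to the `..._with_logits` functions: the activation and the logarithm are combined into one stable expression.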

Examples

1 Mean squared error function

This example fits a linear (first-degree) function. The three loss expressions are equivalent.

import numpy as np
import tensorflow as tf
x_data = np.random.rand(100).astype(np.float32) # Generate 100 random x-values
y_data = x_data * 0.1 + 0.3                     # Calculate the corresponding y-values
Weights = tf.Variable(tf.random_uniform([1],-1.0,1.0))  # One weight drawn uniformly from [-1.0, 1.0)
Biaxs = tf.Variable(tf.zeros([1]))                      # Bias initialized to 0
y = Weights*x_data + Biaxs
loss = tf.losses.mean_squared_error(y_data,y)              # Mean of the squared differences
#loss = tf.reduce_mean(tf.square(y_data-y))
#loss = tf.reduce_mean(tf.square(tf.subtract(y_data,y)))
optimizer = tf.train.GradientDescentOptimizer(0.6)      # Gradient descent with learning rate 0.6
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(200):
    sess.run(train)
    if i % 20 == 0:
        print(sess.run(Weights),sess.run(Biaxs))

The output result is:

[0.10045234] [0.29975605]
[0.10010818] [0.2999417]
[0.10002586] [0.29998606]
[0.10000619] [0.29999667]
[0.10000149] [0.2999992]
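
For readers without a TensorFlow 1.x installation, the same fit can be reproduced with hand-written gradient descent. This is a sketch in plain Python under the same assumptions as the example (data from y = 0.1x + 0.3, learning rate 0.6, 200 steps); the variable names and the fixed random seed are choices made for this illustration, and the printed numbers will differ from the output above because the random data differ.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Same synthetic data as the example: y = 0.1 * x + 0.3 with x in [0, 1).
x_data = [random.random() for _ in range(100)]
y_data = [0.1 * x + 0.3 for x in x_data]

weight = random.uniform(-1.0, 1.0)  # random start, in the spirit of tf.random_uniform
bias = 0.0
learning_rate = 0.6
n = len(x_data)

for step in range(200):
    errors = [weight * x + bias - y for x, y in zip(x_data, y_data)]
    # Gradients of mean(error^2) with respect to weight and bias.
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, x_data)) / n
    grad_b = 2.0 * sum(errors) / n
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b
    if step % 20 == 0:
        mse = sum(e * e for e in errors) / n
        print(step, round(weight, 6), round(bias, 6), mse)
```

As in the TensorFlow run, the weight approaches 0.1 and the bias approaches 0.3 while the mean squared error shrinks toward 0.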

2 Cross entropy function

This example recognizes MNIST handwritten digits. Both loss functions compute a cross entropy; they differ in the function that the logits pass through first (softmax or sigmoid) when the loss is calculated.

import tensorflow as tf 
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data",one_hot = True)
def add_layer(inputs,in_size,out_size,n_layer,activation_function = None):
    # Build one fully connected layer and record histograms for TensorBoard
    layer_name = 'layer%s'%n_layer
    with tf.name_scope(layer_name):
        with tf.name_scope("Weights"):
            Weights = tf.Variable(tf.random_normal([in_size,out_size]),name = "Weights")
            tf.summary.histogram(layer_name+"/weights",Weights)
        with tf.name_scope("biases"):
            biases = tf.Variable(tf.zeros([1,out_size]) + 0.1,name = "biases")
            tf.summary.histogram(layer_name+"/biases",biases)
        with tf.name_scope("Wx_plus_b"):
            Wx_plus_b = tf.matmul(inputs,Weights) + biases
            tf.summary.histogram(layer_name+"/Wx_plus_b",Wx_plus_b)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)
        tf.summary.histogram(layer_name+"/outputs",outputs)
        return outputs
def compute_accuracy(x_data,y_data):
    global prediction
    y_pre = sess.run(prediction,feed_dict={xs:x_data})
    correct_prediction = tf.equal(tf.argmax(y_data,1),tf.argmax(y_pre,1))     # Check whether predicted and true classes are equal
    accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))   # Cast to float32 and take the average
    result = sess.run(accuracy)   # Execute; y_data and y_pre are already concrete arrays, so nothing needs to be fed
    return result
xs = tf.placeholder(tf.float32,[None,784])
ys = tf.placeholder(tf.float32,[None,10])
# The hidden-layer activation is not shown in the source; tf.nn.tanh is an assumption.
layer1 = add_layer(xs,784,150,"layer1",activation_function = tf.nn.tanh)
prediction = add_layer(layer1,150,10,"layer2")
# Since the loss function applies softmax or sigmoid itself, the output layer needs no activation function.
with tf.name_scope("loss"):
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ys,logits = prediction),name = 'loss')
    #loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=ys,logits = prediction),name = 'loss')
    # labels are the true labels, logits are the raw predicted values; the function returns their cross entropy.
    tf.summary.scalar("loss",loss)
# The optimizer class is not shown in the source; tf.train.AdamOptimizer is an assumption.
train = tf.train.AdamOptimizer(4e-3).minimize(loss)
init = tf.global_variables_initializer()
merged = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(init)
    write = tf.summary.FileWriter("logs/",sess.graph)
    for i in range(5001):
        batch_xs,batch_ys = mnist.train.next_batch(100)
        sess.run(train,feed_dict = {xs:batch_xs,ys:batch_ys})
        if i % 1000 == 0:
            # Evaluate on the MNIST test set
            print("The recognition rate for %d training sessions is: %f."%((i+1),compute_accuracy(mnist.test.images,mnist.test.labels)))
            result = sess.run(merged,feed_dict={xs:batch_xs,ys:batch_ys})
            write.add_summary(result,i)

The output result is:

The recognition rate for 1 training session is: 0.103100.
The recognition rate for 1001 training sessions is: 0.900700.
The recognition rate for 2001 training sessions is: 0.928100.
The recognition rate for 3001 training sessions is: 0.938900.
The recognition rate for 4001 training sessions is: 0.945600.
The recognition rate for 5001 training sessions is: 0.952100.
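
The recognition rate printed above comes from compute_accuracy, which takes the argmax of each one-hot label and of each network output, compares the two, and averages the matches. The sketch below shows that calculation in plain Python on made-up three-class data; the helper names `argmax` and `accuracy` are illustrative and not part of the original code.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(one_hot_labels, predictions):
    """Fraction of samples whose predicted class matches the labelled class."""
    correct = [
        1.0 if argmax(label) == argmax(pred) else 0.0
        for label, pred in zip(one_hot_labels, predictions)
    ]
    return sum(correct) / len(correct)


# Hypothetical 3-class data: four samples, one of them misclassified.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
logits = [[2.0, 0.1, -1.0], [0.3, 1.2, 0.0], [0.5, 0.4, 0.1], [-0.2, 3.0, 0.7]]

print(accuracy(labels, logits))  # 3 of 4 correct
```

The raw logits can be compared directly because softmax preserves the ordering of its inputs, so the largest logit is also the largest probability.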
