Gradient descent in neural networks
Before examining the gradient descent algorithm in detail, let's review some related concepts.
1. Learning rate: the learning rate determines the length of each step taken along the negative gradient direction during a gradient descent iteration. In the hill-descent analogy above, it is the length of the step taken in the steepest downhill direction from the current position.
2. Feature: the input part of a sample. For example, given two single-feature samples (x(0), y(0)) and (x(1), y(1)), the feature of the first sample is x(0) and its output is y(0).
3. Hypothesis function: in supervised learning, the function used to fit the input samples, denoted hθ(x). For example, for m single-feature samples (x(i), y(i)) (i = 1, 2, ..., m), a possible fitting function is hθ(x) = θ0 + θ1x.
4. Loss function: to assess how well a model fits the data, a loss function is used to measure the degree of fit. Minimizing the loss function gives the best fit, and the corresponding model parameters are the optimal parameters. In linear regression, the loss function is usually the sum of squared differences between the sample outputs and the hypothesis function. For example, for m samples (xi, yi) (i = 1, 2, ..., m), the linear regression loss function is:
J(θ0, θ1) = ∑i=1..m (hθ(xi) − yi)²
where xi denotes the feature of the i-th sample, yi the corresponding output, and hθ(xi) the hypothesis function.
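The concepts above can be tied together in a short sketch (a toy example of my own, not from the article): a hypothesis function hθ(x) = θ0 + θ1x, the squared-error loss J, and a gradient descent loop whose step size is set by the learning rate.

```python
def h(theta0, theta1, x):
    """Hypothesis function h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

def loss(theta0, theta1, xs, ys):
    """J(theta0, theta1): sum of squared errors over the m samples."""
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step along the negative gradient of J."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        # Partial derivatives of J with respect to theta0 and theta1.
        g0 = sum(2 * (h(theta0, theta1, x) - y) for x, y in zip(xs, ys))
        g1 = sum(2 * (h(theta0, theta1, x) - y) * x for x, y in zip(xs, ys))
        # The learning rate sets the length of each step along
        # the negative gradient (here averaged over the m samples).
        theta0 -= learning_rate * g0 / m
        theta1 -= learning_rate * g1 / m
    return theta0, theta1

# Four single-feature samples generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
# theta0 and theta1 converge toward 1 and 2, driving J toward 0.
```

Note that too large a learning rate can make the updates overshoot and diverge, while too small a value makes convergence very slow; the value here was chosen by hand for this toy data.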
Shared by: Zhang Jiaojuan
This concludes this article on the basic principles of neural networks in Python deep learning. For more related Python neural network content, please search my previous articles or continue to browse the related articles below. I hope you will support me in the future!