
PyTorch Deep Learning: Gradient and Linear Regression Implementation

Gradient

PyTorch's core data structure is the tensor, which has an attribute called requires_grad; when it is set to True, PyTorch starts tracking all operations performed on the tensor, and after the forward computation is complete the gradient can be propagated backward.
When evaluating a model we don't need gradients, so wrap the code that doesn't need them in with torch.no_grad(). Each tensor also has a .grad_fn attribute, the Function that created it: for a tensor constructed directly by the user it is None, otherwise it records the operation that produced the tensor.

torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor
# requires_grad is False by default; below we turn it on explicitly
x = torch.tensor([1, 2, 3], requires_grad=True)  # RuntimeError: only floating point and complex dtypes can require gradients

Note that only floating point and complex data types can require gradients, so here we explicitly specify dtype as torch.float32.

x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float32)
> tensor([1., 2., 3.], requires_grad=True)
y = x + 2
> tensor([3., 4., 5.], grad_fn=<AddBackward0>)
z = y * y * 3
> tensor([27., 48., 75.], grad_fn=<MulBackward0>)

Tensors created directly, like x, have no grad_fn and are called leaf nodes. grad_fn records the basic operation that is replayed when computing the gradient.
For backward propagation of the gradient, look at the following example:

x = torch.ones((2, 2), requires_grad=True)
> tensor([[1., 1.],
>         [1., 1.]], requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
# out is a scalar, so there is no need to pass a gradient argument to backward()
out.backward()
x.grad
> tensor([[4.5000, 4.5000],
>         [4.5000, 4.5000]])
# The gradient must be zeroed before the next backward pass, otherwise it accumulates
x.grad.zero_()
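To see why the zeroing matters, here is a minimal sketch (new toy variables, not part of the example above) showing that repeated backward() calls add into .grad rather than replacing it:

import torch

x = torch.ones(2, 2, requires_grad=True)
(x * 2).sum().backward()
print(x.grad)      # tensor([[2., 2.], [2., 2.]])
(x * 2).sum().backward()
print(x.grad)      # tensor([[4., 4.], [4., 4.]]) -- accumulated, not overwritten
x.grad.zero_()     # reset before any further, unrelated backward pass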


It's worth noting that only the gradients of leaf nodes are populated during backpropagation, i.e., in the example above y.grad and z.grad are not filled in; if you do need them, see the retain_grad() sketch below.
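A minimal sketch, reusing the same y = x + 2 setup as above: calling retain_grad() on a non-leaf tensor before backward() tells autograd to keep its gradient as well.

import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
y.retain_grad()                 # keep the gradient of this non-leaf tensor
out = (y * y * 3).mean()
out.backward()
print(y.grad)                   # tensor([[4.5000, 4.5000], [4.5000, 4.5000]])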
Next, let's look at an example of interrupting gradient tracking.

x = torch.tensor(1., requires_grad=True)
y1 = x ** 2
with torch.no_grad():
	y2 = x ** 3
y3 = y1 + y2
y3.backward()
print(x.grad)
> tensor(2.)


The gradient should have been 5 (dy1/dx + dy2/dx = 2x + 3x² = 5 at x = 1), but since y2 is computed inside with torch.no_grad(), it is not tracked during the gradient calculation and only y1 contributes.

If we want to modify the value of a tensor without the operation being recorded by autograd, we can operate on tensor.data, which is itself a tensor sharing the same storage but detached from the computation graph.
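A minimal sketch of this idea (toy values chosen for illustration): changing x through x.data alters the stored value, but autograd does not record the change, so the backward pass is unaffected by it.

import torch

x = torch.ones(1, requires_grad=True)
print(x.data)                  # tensor([1.]) -- also a tensor
print(x.data.requires_grad)    # False, detached from the graph

y = 2 * x
x.data *= 100                  # only changes the value; not recorded by autograd
y.backward()
print(x)                       # tensor([100.], requires_grad=True)
print(x.grad)                  # tensor([2.])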

Linear regression

Linear regression can be used to predict the price of a house. The price depends on many features; here we simplify the problem by assuming it depends on only two factors, the area (square meters) and the age of the house (years).

The model: y = x1 * w1 + x2 * w2 + b

x1 represents the area, x2 represents the age of the house, and the sale price is y

Simulated data sets

Assume a sample size of 1000, each example having the two features above, so the data is a 2-d tensor of shape 1000 × 2 whose values are drawn from a Gaussian distribution.
labels holds the house prices, one value per sample, as a 1000 × 1 tensor.
The true parameters (true_w and true_b below) are set in advance, and then a noise term δ (also drawn from a Gaussian distribution, to simulate deviations in a real data set) is added.

import torch

num_features = 2      # two features
num_examples = 1000   # number of samples
true_w = torch.normal(0, 1, (num_features, 1))
true_b = torch.tensor(4.2)
samples = torch.normal(0, 1, (num_examples, num_features))
labels = samples.matmul(true_w) + true_b
noise = torch.normal(0, .01, labels.shape)
labels += noise
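As a quick sanity check under the names above, the simulated tensors should have these shapes:

print(samples.shape)   # torch.Size([1000, 2])
print(labels.shape)    # torch.Size([1000, 1])
print(labels[:3])      # the first few simulated prices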

Load Dataset

import random

def data_iter(samples, labels, batch_size):
	num_samples = samples.shape[0]   # length of the batch axis
	indices = [i for i in range(num_samples)]
	random.shuffle(indices)          # shuffle the index list in place
	for i in range(0, num_samples, batch_size):
		j = torch.tensor(indices[i:min(i + batch_size, num_samples)])
		yield samples.index_select(0, j), labels.index_select(0, j)

Tensor.index_select(dim, index)
dim denotes the axis of the tensor along which to select, and index is a 1-d tensor containing the indices to pick.
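A small usage sketch with toy values:

import torch

t = torch.tensor([[10, 11], [20, 21], [30, 31]])
idx = torch.tensor([2, 0])
print(t.index_select(0, idx))
# tensor([[30, 31],
#         [10, 11]])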

Define loss_function

def loss_function(predict, labels):
	# squared error, averaged over the batch
	loss = (predict - labels.reshape(predict.shape)) ** 2 / 2
	return loss.mean()

Defining the Optimizer

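A minimal mini-batch SGD sketch matching the call optimizer([w, b], 0.05) used in the training loop below; this particular implementation is an assumption written to fit that loop, with the in-place parameter update wrapped in torch.no_grad() so autograd does not record it:

def optimizer(params, lr):
	# one plain SGD step: param <- param - lr * param.grad
	with torch.no_grad():
		for param in params:
			param -= lr * param.grad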

Start training

w = torch.normal(0., 1., (num_features, 1), requires_grad=True)
b = torch.tensor(0., dtype=torch.float32, requires_grad=True)
batch_size = 100
for epoch in range(10):
	for data, label in data_iter(samples, labels, batch_size):
		predict = data.matmul(w) + b
		loss = loss_function(predict, label)
		loss.backward()
		optimizer([w, b], 0.05)
		w.grad.zero_()
		b.grad.zero_()
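After training, the learned parameters should be close to the true values used to simulate the data; a quick check:

print(true_w.reshape(-1), w.reshape(-1).detach())   # true vs. learned weights
print(true_b, b.detach())                           # true vs. learned bias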

That concludes the details of implementing gradients and linear regression in PyTorch. For more on gradients and linear regression in PyTorch, please pay attention to my other related articles!