SoFunction
Updated on 2024-11-13

Ultra-detailed PyTorch implementation of handwritten digit recognizer sample code

Preamble

Deep learning has many toy datasets, and MNIST is one of the best known. Being able to work with MNIST is often treated as a measure of whether someone has gotten started with deep learning. Building on the basics covered in earlier posts, we can now implement a simple network that recognizes handwritten digits.

Processing of data

We use PyTorch's own torchvision package for data preprocessing

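import torch
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt

transform = transforms.Compose([
  transforms.ToTensor(),                 # convert to a tensor with values in [0, 1]
  transforms.Normalize((0.5,), (0.5,))   # (x - 0.5) / 0.5 maps values to [-1, 1]
])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=2)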

Note: standardizing the data uses two statistics
mean: the sum of the elements divided by their count
std: the standard deviation, i.e. the square root of the variance (the variance is the mean of the squared differences between each element and the mean)

norm_data = (tensor - mean) / std
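
As a quick check of the formula above, here is a minimal sketch in plain Python. The pixel values are made up for illustration, and the sketch assumes they have already been scaled from 0..255 to 0..1, with mean and std both set to 0.5 as in the transform above.

```python
# Hypothetical pixel values, already scaled from 0..255 to 0..1
pixels = [0.0, 0.25, 0.5, 0.75, 1.0]
mean, std = 0.5, 0.5

def normalize(values, mean, std):
    """Apply norm_data = (tensor - mean) / std element by element."""
    return [(v - mean) / std for v in values]

normalized = normalize(pixels, mean, std)
print(normalized)  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```

The smallest input lands on -1 and the largest on 1, which is the range the rest of the article relies on.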

Here the transform simply maps every image into the range -1 to 1: pixel values are first scaled to [0, 1], and then (x - 0.5) / 0.5 maps them to [-1, 1]. Standardization matters because a feature with very large values would otherwise dominate the weights and drown out the others, even though all of our inputs should count equally. With every value in the same range, no single input carries outsized weight, and the network does not fluctuate wildly during training.
trainloader is now an iterable object, so we can traverse it with a for loop. It produces batches one at a time, much like a generator using yield, which saves memory.
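
To see the iteration in isolation, the sketch below loops over a loader built from random tensors rather than the real dataset. The names `fake_images`, `fake_labels`, and `loader` are stand-ins introduced here, not part of the article's code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 100 random 1x28x28 "images" with labels 0..9
fake_images = torch.randn(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=32, shuffle=True)

batch_shapes = []
for images, labels in loader:
    # Each iteration produces one batch; the last one may be smaller
    batch_shapes.append((tuple(images.shape), tuple(labels.shape)))

print(len(batch_shapes))  # 4 batches: 32 + 32 + 32 + 4
print(batch_shapes[0])    # ((32, 1, 28, 28), (32,))
```

Only one batch is materialized per step, which is why the loader stays cheap even for datasets that do not fit in memory.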

Look at the data.

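def imshow(img):
   img = img / 2 + 0.5 # unnormalize: map [-1, 1] back to [0, 1]
   npimg = img.numpy()
   plt.imshow(np.transpose(npimg, (1, 2, 0))) # (C, H, W) -> (H, W, C) for matplotlib
   plt.show()
# torchvision.utils.make_grid stitches the images of a batch together
imshow(torchvision.utils.make_grid(next(iter(trainloader))[0]))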


Building the network

from torch import nn
import torch.nn.functional as F
class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.conv1 = nn.Conv2d(in_channels=1, out_channels=28, kernel_size=5) # 28x28 -> 24x24, then 12x12 after pooling
    self.pool = nn.MaxPool2d(kernel_size=2, stride=2) # no parameters to learn, so one instance can be reused
    self.conv2 = nn.Conv2d(in_channels=28, out_channels=28*2, kernel_size=5) # 12x12 -> 8x8, then 4x4 after pooling
    self.fc1 = nn.Linear(in_features=28*2*4*4, out_features=1024)
    self.fc2 = nn.Linear(in_features=1024, out_features=10)
  def forward(self, inputs):
    x = self.pool(F.relu(self.conv1(inputs)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(x.size()[0], -1) # flatten to (batch, 28*2*4*4)
    x = F.relu(self.fc1(x))
    return self.fc2(x) # raw logits; softmax is applied inside the loss

[Figure: animated demonstration of the convolution operation]

in_channels: number of input channels (color images have 3 channels, black-and-white images have 1)
out_channels: number of output channels
kernel_size: size of the convolution kernel
stride: step size of the convolution
padding: number of pixels of padding added around each edge of the input

Output size calculation formula

  • h_out = floor((h_in - kernel_size + 2*padding) / stride) + 1
  • w_out = floor((w_in - kernel_size + 2*padding) / stride) + 1
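
Applying the formula to the network above shows where the 4*4 in fc1 comes from. This is a small sketch in plain Python; `conv_out` is a helper name introduced here, and the same formula is used for the pooling layers, which have padding 0.

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output height or width of a convolution or pooling layer."""
    return (size - kernel_size + 2 * padding) // stride + 1

size = 28                                        # MNIST images are 28x28
size = conv_out(size, kernel_size=5)             # conv1: 24
size = conv_out(size, kernel_size=2, stride=2)   # pool:  12
size = conv_out(size, kernel_size=5)             # conv2: 8
size = conv_out(size, kernel_size=2, stride=2)   # pool:  4
print(size)                  # 4
print(28 * 2 * size * size)  # 896, the in_features of fc1
```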

MaxPool2d: has no learnable parameters; it only takes the maximum value in each window
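
A small sketch makes both points concrete: the layer keeps the largest value in each 2x2 window and holds nothing to train. The 4x4 input is a made-up example.

```python
import torch
from torch import nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)

# One 4x4 single-channel image, shape (batch, channels, height, width)
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
pooled = pool(x)

print(pooled.squeeze().tolist())     # [[5.0, 7.0], [13.0, 15.0]]
print(len(list(pool.parameters())))  # 0, nothing to learn
```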


Instantiate the network and the optimizer, and train on the GPU if one is available

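net = Net()
opt = torch.optim.Adam(params=net.parameters(), lr=0.001) # the optimizer class is missing from the source; Adam is assumed here
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)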
Net(
 (conv1): Conv2d(1, 28, kernel_size=(5, 5), stride=(1, 1))
 (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
 (conv2): Conv2d(28, 56, kernel_size=(5, 5), stride=(1, 1))
 (fc1): Linear(in_features=896, out_features=1024, bias=True)
 (fc2): Linear(in_features=1024, out_features=10, bias=True)
)

Main training code

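for epoch in range(50):
  for images, labels in trainloader:
    images = images.to(device)
    labels = labels.to(device)
    pre_label = net(images) # logits, shape (batch, 10)
    loss = F.cross_entropy(input=pre_label, target=labels).mean()
    pre_label = torch.argmax(pre_label, dim=1) # predicted class per sample
    acc = (pre_label==labels).sum()/torch.tensor(labels.size()[0], dtype=torch.float32)
    net.zero_grad()
    loss.backward()
    opt.step()
  # accuracy and loss of the last batch in the epoch
  print(acc.detach().cpu().numpy(), loss.detach().cpu().numpy())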

F.cross_entropy: the cross-entropy loss function


F.cross_entropy already applies softmax internally (as log-softmax), so there is no need to apply softmax yourself.
torch.argmax returns the index at which the largest value is located.
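
Both statements can be checked on random logits. The sketch below uses made-up logits and labels; it compares F.cross_entropy with an explicit log-softmax followed by the negative log-likelihood loss, and confirms that taking the argmax of the raw logits gives the same prediction as taking it after softmax.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw network outputs: 4 samples, 10 classes
targets = torch.tensor([3, 7, 0, 9])  # made-up labels

loss_direct = F.cross_entropy(logits, targets)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(loss_direct, loss_manual))  # True

# softmax preserves ordering, so argmax of the logits gives the same prediction
pred_from_logits = torch.argmax(logits, dim=1)
pred_from_probs = torch.argmax(F.softmax(logits, dim=1), dim=1)
print(torch.equal(pred_from_logits, pred_from_probs))  # True
```

Note that F.cross_entropy already averages over the batch by default, so it returns a scalar.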

acc = (pre_label==labels).sum()/torch.tensor(labels.size()[0], dtype=torch.float32)
# pre_label==labels compares two tensors of the same shape element by element, returning True where they match and False where they differ. True counts as 1 and False as 0, so the sum is the number of correct predictions; dividing by the batch size gives the accuracy
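
Here is the same calculation on five made-up predictions, so each step can be followed by hand. The values are hypothetical.

```python
import torch

pre_label = torch.tensor([1, 0, 4, 9, 2])  # hypothetical predictions
labels = torch.tensor([1, 0, 3, 9, 7])     # hypothetical ground truth

matches = pre_label == labels              # tensor([True, True, False, True, False])
acc = matches.sum() / torch.tensor(labels.size()[0], dtype=torch.float32)
print(acc.item())                          # 3 of 5 correct, about 0.6

# Equivalent shorthand: cast to float and take the mean
acc_short = matches.float().mean()
```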

Prediction

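testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=True, num_workers=2)
images, labels = next(iter(testloader)) # take a single batch of 128 test images
images = images.to(device)
labels = labels.to(device)
with torch.no_grad(): # no gradients are needed for inference
  pre_label = net(images)
  pre_label = torch.argmax(pre_label, dim=1)
  acc = (pre_label==labels).sum()/torch.tensor(labels.size()[0], dtype=torch.float32)
  print(acc)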
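
The snippet above scores a single batch. One possible way to score a whole loader is sketched below; `evaluate`, `toy_model`, and `toy_loader` are names introduced here, and the model and data are random stand-ins so that the sketch runs without the dataset or the trained network.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, device):
    """Return the fraction of correct predictions over every batch in loader."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = torch.argmax(model(images), dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Stand-ins: an untrained linear classifier and 64 random "images"
device = torch.device("cpu")
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(device)
toy_loader = DataLoader(
    TensorDataset(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))),
    batch_size=16,
)
accuracy = evaluate(toy_model, toy_loader, device)
print(0.0 <= accuracy <= 1.0)  # True
```

With the article's objects, the call would presumably be `evaluate(net, testloader, device)`.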

Summary

In this section we covered standardizing data, the principle of convolution, and building a simple network, and we got that network to recognize handwritten digits. It also serves as a recap of the previous chapters.
