1. Synopsis
EdgeCNN is a convolutional neural network (CNN) model for point cloud processing. Unlike traditional CNNs, which operate only on 2D image data, EdgeCNN operates on the local neighborhood around each point in a 3D point cloud. It is applicable to tasks such as object recognition and depth estimation, and to applications such as autonomous driving.
2. Implementation steps
2.1 Data preparation
In this experiment we use the ModelNet10 dataset as an example. It contains 4,899 CAD models in 10 object categories (3,991 for training and 908 for testing). Unlike standard image datasets, the samples are large and their structure varies greatly from one to another, so extensive preprocessing is required.
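# Import the dataset, transforms and data loader
import torch_geometric.transforms as T
from torch_geometric.datasets import ModelNet
from torch_geometric.loader import DataLoader

# Define hyperparameters
num_points = 1024
batch_size = 32

# ModelNet stores meshes, so normalize each mesh and sample num_points points from its surface
pre_transform = T.NormalizeScale()
transform = T.SamplePoints(num_points)

# Load the official ModelNet10 training and test splits
full_train_dataset = ModelNet(root='./modelnet', name='10', train=True,
                              transform=transform, pre_transform=pre_transform)
test_dataset = ModelNet(root='./modelnet', name='10', train=False,
                        transform=transform, pre_transform=pre_transform)

# Split the shuffled training data into training and validation datasets (90% / 10%)
full_train_dataset = full_train_dataset.shuffle()
train_dataset_size = int(0.9 * len(full_train_dataset))
train_dataset = full_train_dataset[:train_dataset_size]
val_dataset = full_train_dataset[train_dataset_size:]

# Define the batched data loaders
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)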
With the above code, we load the ModelNet dataset, partition it into training, validation and test sets, and create a data loader for each set so that the data can be processed in batches during training.
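Part of the preprocessing for point clouds is bringing every sample to a common position and scale. The following is a minimal pure-Python sketch of that idea, assuming a cloud is given as a list of (x, y, z) tuples. The function name normalize_scale is hypothetical; it only mirrors in spirit what PyG's NormalizeScale transform does (center the points, then scale them into the interval [-1, 1]).

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def normalize_scale(points: List[Point]) -> List[Point]:
    """Center a point cloud at the origin and scale it into [-1, 1]."""
    n = len(points)
    if n == 0:
        return []
    # Centroid of the cloud: the mean of each coordinate.
    centroid = tuple(sum(p[axis] for p in points) / n for axis in range(3))
    centered = [tuple(p[axis] - centroid[axis] for axis in range(3)) for p in points]
    # Largest absolute coordinate after centering.
    max_abs = max(abs(c) for p in centered for c in p)
    if max_abs == 0.0:
        # Degenerate cloud (all points identical): nothing to scale.
        return centered
    return [tuple(c / max_abs for c in p) for p in centered]


# Illustrative input: three points on the x axis.
cloud = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
normalized = normalize_scale(cloud)
```

After this step every cloud occupies the same coordinate range, which is what makes a fixed neighborhood radius meaningful across samples.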
2.2 Model implementation
When defining the EdgeCNN model, we build the network structure following an architecture commonly used for point clouds. The convolution operation must also be given neighborhood information, so that the network can learn the relationships between nearby points in the cloud.
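To make the role of neighborhood information concrete, here is a small pure-Python sketch, not the PyG implementation. It assumes brute-force distance checks and a user-supplied function h standing in for the learned MLP; the names radius_neighbors and edge_conv_max are hypothetical. For each point i it applies h to the concatenation of x_i and (x_j - x_i) for every neighbor j, then takes the element-wise maximum, which is the aggregation EdgeConv uses by default. Isolated points are filled with zeros here, which is a choice made for this sketch.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def radius_neighbors(points: Sequence[Vector], r: float) -> List[List[int]]:
    """For each point i, list the indices j != i whose distance to i is at most r."""
    return [
        [j for j, q in enumerate(points) if j != i and math.dist(p, q) <= r]
        for i, p in enumerate(points)
    ]


def edge_conv_max(
    features: Sequence[Vector],
    neighbors: Sequence[Sequence[int]],
    h: Callable[[Vector], Vector],
) -> List[Vector]:
    """Element-wise max over h([x_i, x_j - x_i]) for all neighbors j of i."""
    out = []
    for i, x_i in enumerate(features):
        messages = [
            h(x_i + [xj - xi for xj, xi in zip(features[j], x_i)])
            for j in neighbors[i]
        ]
        if messages:
            out.append([max(column) for column in zip(*messages)])
        else:
            # No neighbors within the radius: emit zeros of the output width.
            width = len(h(x_i + [0.0] * len(x_i)))
            out.append([0.0] * width)
    return out


# Illustrative data: two nearby points and one far away.
points = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [2.0, 0.0, 0.0]]
neighbors = radius_neighbors(points, r=0.6)
# The identity function stands in for the learned MLP.
new_features = edge_conv_max(points, neighbors, h=lambda e: e)
```

Increasing r enlarges each point's neighborhood, so the radius plays a role similar to the kernel size of an image convolution.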
import torch
import torch.nn.functional as F
from torch.nn import Sequential as Seq, Linear as Lin, ReLU
from torch_geometric.nn import EdgeConv, global_max_pool, radius_graph

class EdgeCNN(torch.nn.Module):
    def __init__(self, dataset):
        super(EdgeCNN, self).__init__()
        # Define base parameters: the input feature of each point is its xyz position
        self.input_dim = 3
        self.output_dim = dataset.num_classes
        # Define the model structure. EdgeConv passes [x_i, x_j - x_i] to its MLP,
        # so each MLP takes twice the incoming feature size
        self.conv1 = EdgeConv(Seq(Lin(2 * self.input_dim, 32), ReLU()))
        self.conv2 = EdgeConv(Seq(Lin(2 * 32, 64), ReLU()))
        self.conv3 = EdgeConv(Seq(Lin(2 * 64, 128), ReLU()))
        self.conv4 = EdgeConv(Seq(Lin(2 * 128, 256), ReLU()))
        self.fc1 = Lin(256, 1024)
        self.fc2 = Lin(1024, self.output_dim)

    def forward(self, pos, batch):
        # Build the neighborhood graph from the point positions
        edge_index = radius_graph(pos, r=0.6, batch=batch, loop=False)
        # Layer 1 convolution
        x = F.relu(self.conv1(x=pos, edge_index=edge_index))
        # Layer 2 convolution on a wider neighborhood
        edge_index = radius_graph(pos, r=0.9, batch=batch, loop=False)
        x = F.relu(self.conv2(x=x, edge_index=edge_index))
        # Layer 3 convolution
        edge_index = radius_graph(pos, r=1.2, batch=batch, loop=False)
        x = F.relu(self.conv3(x=x, edge_index=edge_index))
        # Layer 4 convolution
        edge_index = radius_graph(pos, r=1.5, batch=batch, loop=False)
        x = F.relu(self.conv4(x=x, edge_index=edge_index))
        # Pool once, after the last convolution, so that x and batch keep
        # one entry per point until here; the result is one vector per point cloud
        x = global_max_pool(x, batch)
        # Fully connected classifier
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=-1)
The above code implements the individual convolutional and fully connected layers of the EdgeCNN model. Functions such as radius_graph connect each point to the points within a given radius, so the radius defines the local region that each convolution sees, much like the kernel size of an image convolution, and this supports point-wise analysis and feature extraction. Finally, the fully connected layers output a vector whose dimension equals the number of categories, and a log-softmax is applied to it so that the loss can be computed.
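The relationship between the log-softmax output and the loss can be shown with a small pure-Python sketch. It assumes a single sample with raw class scores (logits) and a negative log-likelihood loss; the helper names log_softmax and nll_loss are written here for illustration and are not the PyTorch functions themselves.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax: x_k - log(sum_j exp(x_j))."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def nll_loss(log_probs: List[float], target: int) -> float:
    """Negative log-likelihood of the target class."""
    return -log_probs[target]


# Illustrative logits for a 3-class problem; class 0 is the true label.
logits = [2.0, 0.0, 0.0]
log_probs = log_softmax(logits)
loss = nll_loss(log_probs, target=0)
```

Because the model already returns log-probabilities, the loss only has to pick out the entry of the true class and negate it.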
2.3 Model training
After defining the EdgeCNN network structure, we need to choose an optimizer and a loss function, and set hyperparameters such as the number of training epochs, the batch size and the learning rate. We also record the loss and accuracy of each epoch for later tracking and analysis.
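import torch

# Define the training plan, including the loss function, optimizer and number of epochs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
edge_cnn = EdgeCNN(train_dataset).to(device)
train_epochs = 50
learning_rate = 0.01
# The model returns log-probabilities, so the negative log-likelihood loss is used
criterion = torch.nn.NLLLoss()
# Adam is used here as one common choice of optimizer
optimizer = torch.optim.Adam(edge_cnn.parameters(), lr=learning_rate)

losses_per_epoch = []
accuracies_per_epoch = []

for epoch in range(train_epochs):
    edge_cnn.train()
    running_loss = 0.0
    running_corrects = 0.0
    for samples in train_loader:
        samples = samples.to(device)
        optimizer.zero_grad()
        pos, batch, label = samples.pos, samples.batch, samples.y
        out = edge_cnn(pos, batch)
        loss = criterion(out, label)
        loss.backward()
        optimizer.step()
        # Weight the mean batch loss by the batch size to get a per-sample average
        running_loss += loss.item() * samples.num_graphs / len(train_dataset)
        running_corrects += (out.argmax(dim=1) == label).sum().item() / len(train_dataset)
    losses_per_epoch.append(running_loss)
    accuracies_per_epoch.append(running_corrects)
    if (epoch + 1) % 5 == 0:
        print("Train Epoch {}/{} Loss {:.4f} Accuracy {:.4f}".format(
            epoch + 1, train_epochs, running_loss, running_corrects))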
During training, we iterate over the batches, update the model parameters by backpropagation, and accumulate the loss and accuracy. To make later visualization and analysis easier, the loss and accuracy of each epoch are appended to dedicated lists.
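The bookkeeping behind the per-epoch loss and accuracy can be sketched independently of PyTorch. The example below assumes each batch reports its mean loss, its number of correct predictions and its size; the function name epoch_metrics and the batch values are made up for illustration. Weighting each mean batch loss by the batch size keeps the epoch average correct even when the last batch is smaller than the others.

```python
from typing import Iterable, List, Tuple


def epoch_metrics(batches: Iterable[Tuple[float, int, int]]) -> Tuple[float, float]:
    """Aggregate (mean_batch_loss, num_correct, batch_size) tuples over one epoch."""
    total_loss = 0.0
    total_correct = 0
    total_samples = 0
    for mean_loss, num_correct, batch_size in batches:
        # Undo the per-batch averaging so every sample counts equally.
        total_loss += mean_loss * batch_size
        total_correct += num_correct
        total_samples += batch_size
    if total_samples == 0:
        raise ValueError("epoch contains no samples")
    return total_loss / total_samples, total_correct / total_samples


# Containers that collect one value per epoch for later plotting.
losses_per_epoch: List[float] = []
accuracies_per_epoch: List[float] = []

# Made-up batches: two full batches of 4 samples and a final batch of 2.
fake_epoch = [(1.0, 2, 4), (0.5, 3, 4), (2.0, 1, 2)]
epoch_loss, epoch_accuracy = epoch_metrics(fake_epoch)
losses_per_epoch.append(epoch_loss)
accuracies_per_epoch.append(epoch_accuracy)
```

The two lists can then be passed to any plotting tool to inspect how training progresses over the epochs.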
That concludes this example of implementing EdgeCNN with PyTorch and PyG. For more on implementing EdgeCNN with PyTorch and PyG, please see my other related articles!