
Implementing GraphSAGE with PyTorch and PyG: a worked example

Introduction to GraphSAGE

GraphSAGE (Graph SAmple and aggreGatE) is a widely used graph neural network model for node-level representation learning. Following a sampling-and-aggregation strategy, it fuses the features of a node with those of its sampled neighbors to obtain the node's representation, and refines that representation over multiple rounds of iterative updates.
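
To make the sampling-and-aggregation idea concrete, here is a minimal sketch of one mean-aggregator update for a single node, written in plain PyTorch. The names (h, neighbor_idx, W_self, W_neigh) are illustrative only; PyG's SAGEConv applies this kind of update to all nodes at once.

import torch
# Toy graph state: 5 nodes, each with a 4-dimensional feature vector
h = torch.randn(5, 4)                    # current node representations
neighbor_idx = torch.tensor([1, 3, 4])   # sampled neighbors of node 0
# Aggregate the sampled neighbors by taking the mean of their features
h_neigh = h[neighbor_idx].mean(dim=0)
# Update node 0 from its own feature and the aggregated neighbor feature
W_self = torch.nn.Linear(4, 8, bias=False)
W_neigh = torch.nn.Linear(4, 8, bias=False)
h0_new = torch.relu(W_self(h[0]) + W_neigh(h_neigh))
print(h0_new.shape)  # torch.Size([8])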

Implementation steps

Data preparation

In this implementation we again use the Cora dataset as the test case. Since GraphSAGE focuses on updating the features of individual nodes, the dataset needs no special handling here; we only have to convert the data into the format PyG expects.

import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.utils import from_networkx, to_networkx
# Load the Cora dataset
dataset = Planetoid(root='./cora', name='Cora')
data = dataset[0]
# Convert the original graph to networkx and back into the format needed by PyG,
# keeping node features and labels through the round trip
graph = to_networkx(data, node_attrs=['x', 'y'])
data = from_networkx(graph)
# Get the number of nodes and the feature / class dimensions
num_nodes = data.num_nodes
num_features = dataset.num_features
num_classes = dataset.num_classes
# Create boolean masks that split the nodes into disjoint train / val / test sets
data.train_mask = torch.zeros(num_nodes, dtype=torch.bool)
data.val_mask = torch.zeros(num_nodes, dtype=torch.bool)
data.test_mask = torch.zeros(num_nodes, dtype=torch.bool)
data.train_mask[:num_nodes - 2000] = True
data.val_mask[num_nodes - 2000:num_nodes - 1000] = True
data.test_mask[-1000:] = True

Model implementation

Next, we need to define the GraphSAGE model. Unlike a plain GCN convolution, each GraphSAGE layer combines a sampling step with an aggregation step; the model below stacks several such SAGEConv layers (two in the experiments that follow).

import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv

class GraphSAGE(torch.nn.Module):
    def __init__(self, hidden_channels, num_layers):
        super(GraphSAGE, self).__init__()
        self.convs = torch.nn.ModuleList()
        for i in range(num_layers):
            in_channels = hidden_channels if i != 0 else num_features
            out_channels = num_classes if i == num_layers - 1 else hidden_channels
            self.convs.append(SAGEConv(in_channels, out_channels))

    def forward(self, x, edge_index):
        for conv in self.convs[:-1]:
            x = F.relu(conv(x, edge_index))
        # The last layer does not use an activation function
        x = self.convs[-1](x, edge_index)
        return F.log_softmax(x, dim=-1)

In the code above, we implemented a multi-layer GraphSAGE convolution stack with its aggregation step, using ReLU between layers for feature extraction and a log-softmax on the output to produce classification scores.
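
As a quick sanity check (not part of the original walkthrough), we can instantiate the model with illustrative hyperparameters and run a single forward pass on the prepared Cora data to confirm that the output has one row of class log-probabilities per node:

# One forward pass of an untrained model (hyperparameters are illustrative)
model = GraphSAGE(hidden_channels=64, num_layers=2)
out = model(data.x, data.edge_index)
print(out.shape)  # torch.Size([2708, 7]) for Cora: 2708 nodes, 7 classes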

Model training

With the model defined, we can train it on the Cora dataset. First, specify the optimizer and the loss function, and set the training parameters such as the number of epochs and the learning rate.

# Select a device and initialize GraphSAGE with its hyperparameters
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
num_layers = 2
hidden_channels = 256
model = GraphSAGE(hidden_channels, num_layers).to(device)
data = data.to(device)  # move features, labels and masks to the same device
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # Adam is assumed here
loss_func = torch.nn.NLLLoss()  # NLLLoss pairs with the model's log_softmax output
# Training loop
for epoch in range(500):
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = loss_func(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    # Report the test accuracy every 10 epochs
    if epoch % 10 == 0:
        model.eval()
        with torch.no_grad():
            _, pred = model(data.x, data.edge_index).max(dim=1)
            correct = float(pred[data.test_mask].eq(data.y[data.test_mask]).sum().item())
            acc = correct / data.test_mask.sum().item()
            print("Epoch {:03d}, Train Loss {:.4f}, Test Acc {:.4f}".format(
                epoch, loss.item(), acc))

In the code above, we fit the GraphSAGE model on the labeled training nodes, minimize the loss by gradient descent, and report the test accuracy every 10 epochs.
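
The split above also created a validation mask that the training loop never touches. A short evaluation sketch like the one below (my own addition, reusing the variables defined above) reports both validation and test accuracy once training has finished:

# Evaluate the trained model on the held-out validation and test nodes
model.eval()
with torch.no_grad():
    pred = model(data.x, data.edge_index).argmax(dim=1)
    for name, mask in [('Val', data.val_mask), ('Test', data.test_mask)]:
        acc = (pred[mask] == data.y[mask]).float().mean().item()
        print('{} Acc: {:.4f}'.format(name, acc))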

This concludes the worked example of implementing GraphSAGE with PyTorch and PyG. For more on using PyTorch and PyG for GraphSAGE, please see my other related articles!