Preamble
The Open Neural Network Exchange (ONNX) format is a standard for representing deep learning models that allows models to be transferred between different frameworks.
A model defined in PyTorch is a dynamic graph, whose forward propagation is defined and implemented in the forward method of the class.
However, executing Python code is relatively inefficient; if the dynamic graph is converted to a static graph, the inference speed of the model can be improved.
In the PyTorch framework, a model whose parent class is torch.nn.Module can be exported to an onnx file with torch.onnx.export.
Its three most important parameters are:
- model: a model whose parent class is torch.nn.Module
- args: the variables passed into the model's forward method, of type tuple
- f: the file name of the exported onnx file (a string)
```python
import torch
from torchvision.models import resnet50

file = 'resnet50.onnx'

# Declare the model
resnet = resnet50(pretrained=False).eval()
image = torch.zeros([1, 3, 224, 224])

# Export as an onnx file
torch.onnx.export(resnet, (image,), file)
```
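torch.onnx.export also accepts optional keyword arguments (these are what the test method later forwards as export_kwargs). A hedged sketch of the common ones, naming the graph's input/output nodes and allowing a dynamic batch size; the node names and opset version here are illustrative choices, not values from the article:

```python
# A minimal sketch of common optional export arguments
torch.onnx.export(
    resnet, (image,), file,
    input_names=['image'],                 # name of the graph's input node (assumed)
    output_names=['logits'],               # name of the graph's output node (assumed)
    dynamic_axes={'image': {0: 'batch'}},  # let the batch dimension vary at inference time
    opset_version=11,                      # ONNX operator set to target
)
```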
The exported onnx file can be opened with Netron to view the model structure.
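Besides viewing it in Netron, the file can also be loaded and validated programmatically with the onnx package; a small sketch, assuming onnx is installed and the file name used in the export above:

```python
import onnx

onnx_model = onnx.load('resnet50.onnx')               # file name assumed from the export above
onnx.checker.check_model(onnx_model)                  # raises an exception if the graph is invalid
print(onnx.helper.printable_graph(onnx_model.graph))  # text dump of the graph structure
```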
Basic usage
To run onnx models in Python, you need to install onnxruntime:
```bash
# One or the other will do
pip install onnxruntime      # CPU version
pip install onnxruntime-gpu  # GPU version
```
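After installation, you can check which build is active and which execution providers it exposes; this is the basis for the provider selection used in the code below:

```python
import onnxruntime as ort

print(ort.get_device())               # 'GPU' for the GPU build, otherwise 'CPU'
print(ort.get_available_providers())  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']
```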
Inference is done with onnxruntime's InferenceSession, whose more important instance methods are:
- get_inputs(): get the list of input nodes (node attributes: name, shape, type)
- get_outputs(): get the list of output nodes (node attributes: name, shape, type)
- run(output_names, input_feed): the input variables are numpy arrays (note that the dtype should be float32); runs model inference and returns the outputs
From this, the basic usage of the onnx model follows:
```python
import onnxruntime as ort
import numpy as np

file = 'resnet50.onnx'

# Find GPU / CPU
provider = ort.get_available_providers()[
    1 if ort.get_device() == 'GPU' else 0]
print('Device:', provider)

# Declare the onnx model
model = ort.InferenceSession(file, providers=[provider])

# Print the attributes of the input and output nodes
for node_list in model.get_inputs(), model.get_outputs():
    for node in node_list:
        attr = {'name': node.name, 'shape': node.shape, 'type': node.type}
        print(attr)
    print('-' * 60)

# Get the names of the input and output nodes
input_node_name = model.get_inputs()[0].name
output_node_name = [node.name for node in model.get_outputs()]

image = np.zeros([1, 3, 224, 224]).astype(np.float32)
print(model.run(output_names=output_node_name, input_feed={input_node_name: image}))
```
Advanced API
To simplify the usage steps, these are encapsulated in a class:
```python
import onnxruntime as ort


class Onnx_Module(ort.InferenceSession):
    ''' onnx inference model
        provider: prioritize the GPU'''
    provider = ort.get_available_providers()[
        1 if ort.get_device() == 'GPU' else 0]

    def __init__(self, file):
        super(Onnx_Module, self).__init__(file, providers=[self.provider])
        # Names of the input and output nodes
        self.input_names = [node_arg.name for node_arg in self.get_inputs()]
        self.output_names = [node_arg.name for node_arg in self.get_outputs()]

    def __call__(self, *arrays):
        input_feed = {name: x for name, x in zip(self.input_names, arrays)}
        return self.run(self.output_names, input_feed)
```
In PyTorch, for a convolutional neural network model and an input image, the inference code is model(image); using this wrapper class is just as simple:
```python
import numpy as np

file = 'resnet50.onnx'
model = Onnx_Module(file)
image = np.zeros([1, 3, 224, 224]).astype(np.float32)
print(model(image))
```
To make it easier to observe the speed difference between the Torch model and the onnx model, and to check whether the outputs of the two models are consistent, a test method is also written.
Its parameters are the same as those of torch.onnx.export, and the basic procedure is as follows:
- Get the output of the Torch model and print its inference time
- Export the Torch model as an onnx file, and convert the input variables from Tensor to ndarray
- Initialize the onnx model, get its output, and print its inference time
- Calculate the mean absolute error between the outputs of the Torch model and the onnx model
- Return the onnx model
```python
import numpy as np
import torch
import onnxruntime as ort


class Timer:
    repeat = 3

    def __new__(cls, fun, *args, **kwargs):
        import time
        start = time.time()
        for _ in range(cls.repeat):
            fun(*args, **kwargs)
        cost = (time.time() - start) / cls.repeat
        return cost * 1e3  # ms


class Onnx_Module(ort.InferenceSession):
    ''' onnx inference model
        provider: prioritize the GPU'''
    provider = ort.get_available_providers()[
        1 if ort.get_device() == 'GPU' else 0]

    def __init__(self, file):
        super(Onnx_Module, self).__init__(file, providers=[self.provider])
        # Names of the input and output nodes
        self.input_names = [node_arg.name for node_arg in self.get_inputs()]
        self.output_names = [node_arg.name for node_arg in self.get_outputs()]

    def __call__(self, *arrays):
        input_feed = {name: x for name, x in zip(self.input_names, arrays)}
        return self.run(self.output_names, input_feed)

    @classmethod
    def test(cls, model, args, file, **export_kwargs):
        # Test the Torch model's run time
        torch_output = model(*args).data.numpy()
        print(f'Torch: {Timer(model, *args):.2f} ms')
        # model: Torch -> onnx
        torch.onnx.export(model, args, file, **export_kwargs)
        # data: tensor -> array
        args = tuple(map(lambda tensor: tensor.data.numpy(), args))
        onnx_model = cls(file)
        # Test the onnx model's run time
        onnx_output = onnx_model(*args)
        print(f'Onnx: {Timer(onnx_model, *args):.2f} ms')
        # Calculate the mean absolute error between the Torch and onnx outputs
        abs_error = np.abs(torch_output - onnx_output).mean()
        print(f'Mean Error: {abs_error:.2f}')
        return onnx_model
```
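The comparison below was presumably produced with a call along these lines; a minimal sketch reusing the ResNet50 setup from the preamble (the file name is assumed):

```python
import torch
from torchvision.models import resnet50

file = 'resnet50.onnx'                      # assumed output file name
resnet = resnet50(pretrained=False).eval()
image = torch.zeros([1, 3, 224, 224])

# Export, time both models, and compare their outputs
onnx_resnet = Onnx_Module.test(resnet, (image,), file)
```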
For ResNet50, the Torch model takes 172.67 ms per inference while the onnx model takes 36.56 ms, so the onnx model needs only 21.17% of the Torch model's time.
This concludes this article on exporting PyTorch models to onnx files and calling them. For more on exporting PyTorch models, please search my previous articles or continue to browse the related articles below; I hope you will continue to support me!