Model Conversion

  • PyTorch to ONNX
import torch

torch_model = torch.load("save.pt")   # load the PyTorch model
batch_size = 1                        # batch size
input_shape = (3, 224, 224)           # input shape (C, H, W)
 
# set the model to inference mode
torch_model.eval()
 
x = torch.randn(batch_size, *input_shape)        # dummy input tensor
export_onnx_file = "test.onnx"                   # target ONNX file name
torch.onnx.export(torch_model,
                  x,
                  export_onnx_file,
                  opset_version=10,
                  do_constant_folding=True,   # whether to apply constant folding optimization
                  input_names=["input"],      # input names
                  output_names=["output"],    # output names
                  dynamic_axes={"input": {0: "batch_size"},      # variable batch dimension
                                "output": {0: "batch_size"}})
  • TensorFlow: pb model to UFF, then to engine

pb model to UFF

First install convert-to-uff: apt install uff-converter-tf
Run: python3 /usr/local/bin/convert-to-uff --help
Output:
 
Converts TensorFlow models to Unified Framework Format (UFF).
 
positional arguments:
  input_file            path to input model (protobuf file of frozen GraphDef)
 
optional arguments:
  -h, --help            show this help message and exit
  -l, --list-nodes      show list of nodes contained in input file
  -t, --text            write a text version of the output in addition to the
                        binary
  --write_preprocessed  write the preprocessed protobuf in addition to the
                        binary
  -q, --quiet           disable log messages
  -d, --debug           Enables debug mode to provide helpful debugging output
  -o OUTPUT, --output OUTPUT
                        name of output uff file
  -O OUTPUT_NODE, --output-node OUTPUT_NODE
                        name of output nodes of the model
  -I INPUT_NODE, --input-node INPUT_NODE
                        name of a node to replace with an input to the model.
                        Must be specified as:
                        "name,new_name,dtype,dim1,dim2,..."
  -p PREPROCESSOR, --preprocessor PREPROCESSOR
                        the preprocessing file to run before handling the
                        graph. This file must define a `preprocess` function
                        that accepts a GraphSurgeon DynamicGraph as it's
                        input. All transformations should happen in place on
                        the graph, as return values are discarded
Conversion command:
python3 /usr/local/bin/convert-to-uff model.pb -o model.uff -O softmax/Softmax -I input_1,input_1,float32,1,3,224,224
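
The -O (output node) and -I (input node) values must match the node names inside the frozen graph. They can be listed with the -l/--list-nodes option shown above, or read straight from the GraphDef; below is a minimal sketch, assuming TensorFlow is installed and model.pb is a frozen GraphDef.

import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with open("model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# input nodes usually appear first, output nodes last
print([n.name for n in graph_def.node[:3]])
print([n.name for n in graph_def.node[-3:]])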

UFF to engine

Run: /usr/src/tensorrt/bin/trtexec --help
Output:
 
=== Model Options ===
  --uff=<file>                UFF model
  --onnx=<file>               ONNX model
  --model=<file>              Caffe model (default = no model, random weights used)
  --deploy=<file>             Caffe prototxt file
  --output=<name>[,<name>]*   Output names (it can be specified multiple times); at least one output is required for UFF and Caffe
  --uffInput=<name>,X,Y,Z     Input blob name and its dimensions (X,Y,Z=C,H,W), it can be specified multiple times; at least one is required for UFF models
  --uffNHWC                   Set if inputs are in the NHWC layout instead of NCHW (use X,Y,Z=H,W,C order in --uffInput)
 
=== Build Options ===
  --maxBatch                  Set max batch size and build an implicit batch engine (default = 1)
  --explicitBatch             Use explicit batch sizes when building the engine (default = implicit)
  --minShapes=spec            Build with dynamic shapes using a profile with the min shapes provided
  --optShapes=spec            Build with dynamic shapes using a profile with the opt shapes provided
  --maxShapes=spec            Build with dynamic shapes using a profile with the max shapes provided
                              Note: if any of min/max/opt is missing, the profile will be completed using the shapes
                                    provided and assuming that opt will be equal to max unless they are both specified;
                                    partially specified shapes are applied starting from the batch size;
                                    dynamic shapes imply explicit batch
                                    input names can be wrapped with single quotes (ex: 'Input:0')
                              Input shapes spec ::= Ishp[","spec]
                                           Ishp ::= name":"shape
                                          shape ::= N[["x"N]*"*"]
  --inputIOFormats=spec       Type and formats of the input tensors (default = all inputs in fp32:chw)
  --outputIOFormats=spec      Type and formats of the output tensors (default = all outputs in fp32:chw)
                              IO Formats: spec  ::= IOfmt[","spec]
                                          IOfmt ::= type:fmt
                                          type  ::= "fp32"|"fp16"|"int32"|"int8"
                                          fmt   ::= ("chw"|"chw2"|"chw4"|"hwc8"|"chw16"|"chw32")["+"fmt]
  --workspace=N               Set workspace size in megabytes (default = 16)
  --minTiming=M               Set the minimum number of iterations used in kernel selection (default = 1)
  --avgTiming=M               Set the number of times averaged in each iteration for kernel selection (default = 8)
  --fp16                      Enable fp16 algorithms, in addition to fp32 (default = disabled)
  --int8                      Enable int8 algorithms, in addition to fp32 (default = disabled)
  --calib=<file>              Read INT8 calibration cache file
  --safe                      Only test the functionality available in safety restricted flows
  --saveEngine=<file>         Save the serialized engine
  --loadEngine=<file>         Load a serialized engine
 
=== Inference Options ===
  --batch=N                   Set batch size for implicit batch engines (default = 1)
  --shapes=spec               Set input shapes for dynamic shapes inputs. Input names can be wrapped with single quotes(ex: 'Input:0')
                              Input shapes spec ::= Ishp[","spec]
                                           Ishp ::= name":"shape
                                          shape ::= N[["x"N]*"*"]
  --loadInputs=spec           Load input values from files (default = generate random inputs). Input names can be wrapped with single quotes (ex: 'Input:0')
                              Input values spec ::= Ival[","spec]
                                           Ival ::= name":"file
  --iterations=N              Run at least N inference iterations (default = 10)
  --warmUp=N                  Run for N milliseconds to warmup before measuring performance (default = 200)
  --duration=N                Run performance measurements for at least N seconds wallclock time (default = 3)
  --sleepTime=N               Delay inference start with a gap of N milliseconds between launch and compute (default = 0)
  --streams=N                 Instantiate N engines to use concurrently (default = 1)
  --exposeDMA                 Serialize DMA transfers to and from device. (default = disabled)
  --useSpinWait               Actively synchronize on GPU events. This option may decrease synchronization time but increase CPU usage and power (default = disabled)
  --threads                   Enable multithreading to drive engines with independent threads (default = disabled)
  --useCudaGraph              Use cuda graph to capture engine execution and then launch inference (default = disabled)
  --buildOnly                 Skip inference perf measurement (default = disabled)
 
=== Build and Inference Batch Options ===
                              When using implicit batch, the max batch size of the engine, if not given,
                              is set to the inference batch size;
                              when using explicit batch, if shapes are specified only for inference, they
                              will be used also as min/opt/max in the build profile; if shapes are
                              specified only for the build, the opt shapes will be used also for inference;
                              if both are specified, they must be compatible; and if explicit batch is
                              enabled but neither is specified, the model must provide complete static
                              dimensions, including batch size, for all inputs
 
=== Reporting Options ===
  --verbose                   Use verbose logging (default = false)
  --avgRuns=N                 Report performance measurements averaged over N consecutive iterations (default = 10)
  --percentile=P              Report performance for the P percentage (0<=P<=100, 0 representing max perf, and 100 representing min perf; (default = 99%)
  --dumpOutput                Print the output tensor(s) of the last inference iteration (default = disabled)
  --dumpProfile               Print profile information per layer (default = disabled)
  --exportTimes=<file>        Write the timing results in a json file (default = disabled)
  --exportOutput=<file>       Write the output tensors to a json file (default = disabled)
  --exportProfile=<file>      Write the profile information per layer in a json file (default = disabled)
 
=== System Options ===
  --device=N                  Select cuda device N (default = 0)
  --useDLACore=N              Select DLA core N for layers that support DLA (default = none)
  --allowGPUFallback          When DLA is enabled, allow GPU fallback for unsupported layers (default = disabled)
  --plugins                   Plugin library (.so) to load (can be specified multiple times)
 
=== Help ===
  --help                      Print this message
 
Conversion command:
/usr/src/tensorrt/bin/trtexec --uff=/home/model/model.uff --uffInput=input_1,1,3,224,224 --output=softmax/Softmax --saveEngine=/home/model/model.engine --outputIOFormats=fp32:chw --buildOnly --useCudaGraph
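
To confirm the serialized engine deserializes and runs, it can be loaded back with trtexec using the --loadEngine and --dumpOutput options from the help above (the path follows the example):

/usr/src/tensorrt/bin/trtexec --loadEngine=/home/model/model.engine --batch=1 --dumpOutput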
  • TensorFlow: pb model to ONNX, then to engine

pb model to ONNX

First install tf2onnx: pip install -U tf2onnx
Run: python3 -m tf2onnx.convert --help to see the usage
Output:
usage: convert.py [-h] [--input INPUT] [--graphdef GRAPHDEF]
                  [--saved-model SAVED_MODEL] [--tag TAG]
                  [--signature_def SIGNATURE_DEF]
                  [--concrete_function CONCRETE_FUNCTION]
                  [--checkpoint CHECKPOINT] [--keras KERAS] [--large_model]
                  [--output OUTPUT] [--inputs INPUTS] [--outputs OUTPUTS]
                  [--opset OPSET] [--custom-ops CUSTOM_OPS]
                  [--extra_opset EXTRA_OPSET] [--target {rs4,rs5,rs6,caffe2}]
                  [--continue_on_error] [--verbose] [--debug]
                  [--output_frozen_graph OUTPUT_FROZEN_GRAPH] [--fold_const]
                  [--inputs-as-nchw INPUTS_AS_NCHW]
 
Convert tensorflow graphs to ONNX.
 
optional arguments:
  -h, --help            show this help message and exit
  --input INPUT         input from graphdef
  --graphdef GRAPHDEF   input from graphdef
  --saved-model SAVED_MODEL
                        input from saved model
  --tag TAG             tag to use for saved_model
  --signature_def SIGNATURE_DEF
                        signature_def from saved_model to use
  --concrete_function CONCRETE_FUNCTION
                        For TF2.x saved_model, index of func signature in
                        __call__ (--signature_def is ignored)
  --checkpoint CHECKPOINT
                        input from checkpoint
  --keras KERAS         input from keras model
  --large_model         use the large model format (for models > 2GB)
  --output OUTPUT       output model file
  --inputs INPUTS       model input_names
  --outputs OUTPUTS     model output_names
  --opset OPSET         opset version to use for onnx domain
  --custom-ops CUSTOM_OPS
                        list of custom ops
  --extra_opset EXTRA_OPSET
                        extra opset with format like domain:version, e.g.
                        com.microsoft:1
  --target {rs4,rs5,rs6,caffe2}
                        target platform
  --continue_on_error   continue_on_error
  --verbose, -v         verbose output, option is additive
  --debug               debug mode
  --output_frozen_graph OUTPUT_FROZEN_GRAPH
                        output frozen tf graph to file
  --fold_const          Deprecated. Constant folding is always enabled.
  --inputs-as-nchw INPUTS_AS_NCHW
                        transpose inputs as from nhwc to nchw
 
Usage Examples:
 
python -m tf2onnx.convert --saved-model saved_model_dir --output model.onnx
python -m tf2onnx.convert --input frozen_graph.pb  --inputs X:0 --outputs output:0 --output model.onnx
python -m tf2onnx.convert --checkpoint checkpoint.meta  --inputs X:0 --outputs output:0 --output model.onnx
 
For help and additional information see:
    https://github.com/onnx/tensorflow-onnx
 
If you run into issues, open an issue here:
    https://github.com/onnx/tensorflow-onnx/issues
 
Convert to ONNX:
python3 -m tf2onnx.convert --input model.pb --inputs input_1:0 --outputs softmax/Softmax:0 --inputs-as-nchw input_1:0 --output model.onnx --opset 13
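
Before building the engine, it helps to confirm the tensor names and that --inputs-as-nchw actually produced an NCHW input; a minimal sketch, assuming onnxruntime is installed:

import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")
for i in sess.get_inputs():
    print(i.name, i.shape)    # expect something like input_1:0 [N, 3, 224, 224]
for o in sess.get_outputs():
    print(o.name, o.shape)    # expect something like softmax/Softmax:0 [N, 3]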

ONNX to engine

ONNX to engine (this is how to convert when the input shape is dynamic):
/usr/src/tensorrt/bin/trtexec --onnx=/home/model/model.onnx  --explicitBatch --minShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --optShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --maxShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --shapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw --saveEngine=/home/model/model.engine  --buildOnly --useCudaGraph
 
If the input shape is fixed, remove --explicitBatch --minShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --optShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --maxShapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3 --shapes=\'input_1:0\':1x3x224x224,\'softmax/Softmax:0\':1x3, and add --batch batch_size instead, where batch_size is the batch size of the TensorFlow model.
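
Once the engine is built, it can be deserialized and run from Python. The following is a minimal sketch rather than a production harness: it assumes the tensorrt and pycuda packages, an explicit-batch engine with exactly one input and one output as built above, the 1x3x224x224 / 1x3 shapes from the commands, and that binding 0 is the input; details vary slightly across TensorRT versions.

import numpy as np
import pycuda.autoinit           # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("/home/model/model.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# page-locked host buffers and matching device buffers
h_input = cuda.pagelocked_empty((1, 3, 224, 224), dtype=np.float32)
h_input[:] = np.random.randn(1, 3, 224, 224)     # replace with a real, preprocessed image
h_output = cuda.pagelocked_empty((1, 3), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)

stream = cuda.Stream()
cuda.memcpy_htod_async(d_input, h_input, stream)
context.execute_async_v2(bindings=[int(d_input), int(d_output)],
                         stream_handle=stream.handle)
cuda.memcpy_dtoh_async(h_output, d_output, stream)
stream.synchronize()
print(h_output)                  # output scores, shape (1, 3) per the commands above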

 
