Accelerating EfficientDet Inference on Jetson TX2 (Part 2)




  • Inference error

    [TensorRT] ERROR: 2: [pluginV2DynamicExtRunner.cpp::execute::115] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed.)
  • Installing directly with pip install onnx_graphsurgeon fails; install nvidia-pyindex first

    pip install nvidia-pyindex
    pip install onnx-graphsurgeon
  • Unsupported-operator warnings while generating the ONNX file

    INFO:EfficientDetGraphSurgeon:Created NMS plugin 'EfficientNMS_TRT' with attributes: {'plugin_version': '1', 'background_class': -1, 'max_output_boxes': 100, 'score_threshold': 0.4000000059604645, 'iou_threshold': 0.5, 'score_activation': True, 'box_coding': 1}
    Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
    Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
    Warning: Unsupported operator EfficientNMS_TRT. No schema registered for this operator.
  • Installing dm-tree fails
    unable to execute ‘bazel’: No such file or directory #1089

    Failed to build dm-tree
    Installing collected packages: dm-tree
        Running install for dm-tree ... error
    CMAKE_SOURCE_DIR = /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/tree/tree
    CMAKE_BINARY_DIR = /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/build_tree
    Current build type is: RELEASE
    PROJECT_BINARY_DIR is: /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/build_tree
    pybind11 v2.6.2 
    Configuring done
    Generating done
  • Building dm-tree from source (method 1) fails

    /usr/bin/ld: cannot open output file tree/ No such file or directory
    collect2: error: ld returned 1 exit status
    CMakeFiles/_tree.dir/build.make:101: recipe for target 'tree/' failed
    make[2]: *** [tree/] Error 1
    CMakeFiles/Makefile2:127: recipe for target 'CMakeFiles/_tree.dir/all' failed
    make[1]: *** [CMakeFiles/_tree.dir/all] Error 2
    Makefile:90: recipe for target 'all' failed
    make: *** [all] Error 2
    pip install -r /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/tree/docs/requirements.txt
    python install
  • Installing tensorflow-model-optimization fails

    Failed to build dm-tree
    Installing collected packages: dm-tree, tensorflow-model-optimization
        Running install for dm-tree ... error
    Once dm-tree is installed, tensorflow-model-optimization installs without problems.
  • Installing bazel fails
    Install Tensorflow Object Detection API for

    The error complains about a missing binary called bazel.
    You can install it by building it from source.
    # Reference:
    set -e
    mkdir -p $folder
    echo "** Install requirements"
    sudo apt-get install -y pkg-config zip g++ zlib1g-dev unzip
    sudo apt-get install -y openjdk-8-jdk
    echo "** Download bazel-3.1.0 sources"
    pushd $folder
    if [ ! -f ]; then
    echo "** Build and install bazel-3.1.0"
  • The environment used on the GTX 1650 server cannot be installed directly with pip on the Jetson TX2; some packages fail to install

    pip install -r requirements-gpu.txt
    Remove the version numbers of all packages in requirements-gpu.txt, so that pip installs by default the latest versions compatible with the Jetson TX2.
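Removing the version numbers by hand is error-prone for a long requirements file, so the step can be scripted. The following is a minimal sketch, not the tooling used in this post: the helper name `strip_version_pins` is hypothetical, and it assumes simple requirement lines such as `name==1.2.3`, optionally with extras or an environment marker. Comments, blank lines, pip options and URL-based requirements are passed through unchanged.

```python
import re

# Leading package name, optionally followed by extras such as "pkg[extra]".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)")


def strip_version_pins(lines):
    """Return the requirement lines with their version specifiers removed."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Blank lines, comments, pip options ("-r", "--index-url") and
        # URL-based requirements are kept exactly as written.
        if not stripped or stripped.startswith(("#", "-")) or "://" in stripped:
            result.append(stripped)
            continue
        # Keep an environment marker (the part after ";") if there is one.
        requirement, sep, marker = stripped.partition(";")
        match = _NAME.match(requirement)
        if match is None:
            # Unrecognised form (for example a local path): leave it alone.
            result.append(stripped)
            continue
        name = match.group(1)
        result.append(name + ("; " + marker.strip() if sep else ""))
    return result


if __name__ == "__main__":
    # Illustrative lines only; they are not taken from requirements-gpu.txt.
    sample = ["numpy==1.19.5", "Pillow>=8.0", "# a comment"]
    print("\n".join(strip_version_pins(sample)))
```

Writing the result to a separate file and running `pip install -r` on that copy keeps the original pinned file available for the GTX 1650 server.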
  • Creating the virtualenv virtual environment fails

    tx2@tx2:/media/mydisk/MyDocuments/PyProjects/automl/efficientdet$ virtualenv -p /usr/bin/python3 venv
    Already using interpreter /usr/bin/python3
    Using base prefix '/usr'
    New python executable in /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/bin/python3
    Also creating executable in /media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/bin/python
    Installing setuptools, pkg_resources, pip, wheel...
      Complete output from command /media/mydisk/MyDocu...det/venv/bin/python3 - setuptools pkg_resources pip wheel:
    Traceback (most recent call last):
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/", line 215, in main
        status =, args)
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/commands/", line 290, in run
        with self._build_session(options) as session:
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/", line 69, in _build_session
        if options.cache_dir else None
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/", line 80, in join
        a = os.fspath(a)
    TypeError: expected str, bytes or os.PathLike object, not int
    ...Installing setuptools, pkg_resources, pip, wheel...done.
    Traceback (most recent call last):
      File "/usr/bin/virtualenv", line 11, in <module>
        load_entry_point('virtualenv==15.1.0', 'console_scripts', 'virtualenv')()
      File "/usr/lib/python3/dist-packages/", line 724, in main
      File "/usr/lib/python3/dist-packages/", line 992, in create_environment
      File "/usr/lib/python3/dist-packages/", line 922, in install_wheel
        call_subprocess(cmd, show_stdout=False, extra_env=env, stdin=SCRIPT)
      File "/usr/lib/python3/dist-packages/", line 817, in call_subprocess
        % (cmd_desc, proc.returncode))
    OSError: Command /media/mydisk/MyDocu...det/venv/bin/python3 - setuptools pkg_resources pip wheel failed with error code 2
    Traceback (most recent call last):
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/", line 215, in main
        status =, args)
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/commands/", line 290, in run
        with self._build_session(options) as session:
      File "/usr/share/python-wheels/pip-9.0.1-py2.py3-none-any.whl/pip/", line 69, in _build_session
    virtualenv -p /usr/bin/python3 venv
  • Building the FP32 engine succeeds, but building the FP16 engine fails

    [TensorRT] ERROR: 2: [pluginV2DynamicExtRunner.cpp::execute::115] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed.)
    Traceback (most recent call last):
      File "", line 240, in <module>
      File "", line 212, in main
      File "", line 203, in create_engine
        with self.builder.build_engine(, self.config) as engine, open(engine_path, "wb") as f:
    AttributeError: __enter__
    EfficientNMS_TRT not working on jetson nano (TensorRT 8.0.1) #1538
    This problem did not occur if BatchedNMS_TRT was used instead of EfficientNMS_TRT by giving the --legacy_plugins option when creating the onnx file in
    What's even more strange is that it was built without any problems at Jetson Xavier NX. (same Jetpack, tensorrt version).
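    Someone has tried this: the problem occurs on the Jetson TX2, but there is no problem at all on the Jetson Xavier NX.
    Fix: add the `--legacy_plugins` option when generating the ONNX file.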
    python \
        --input_shape '1,512,512,3' \
        --saved_model /media/mydisk/YOYOFile/saved_model \
        --onnx /media/mydisk/YOYOFile/saved_model_onnx/model.onnx \
  • If the TensorRT error message cannot be traced

    builder.build_engine throws AttributeError: __enter__ #234
    Change trt.Logger.ERROR to trt.Logger.VERBOSE.
    Set the TRT_LOGGER's verbosity to VERBOSE: TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
  • Insufficient GPU memory

    [TensorRT] ERROR: Tactic Device request: 1686MB Available: 1536MB. Device memory is insufficient to use tactic.
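    The Jetson TX2 reports an insufficient-GPU-memory ERROR, but the program does not terminate. Presumably the tactic that does not fit in memory is skipped and memory use is adjusted automatically, so that running short of GPU memory does not abort the program.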
    (venv) tx2@tx2:/media/mydisk/MyDocuments/PyProjects/TensorRT/samples/python/efficientdet$ time python \
    >     --engine /media/mydisk/YOYOFile/saved_model_trt_fp16/engine.trt \
    >     --saved_model /media/mydisk/YOYOFile/saved_model \
    >     --input /media/mydisk/YOYOFile/coco_calib \
    >     --output /media/mydisk/YOYOFile/output_fp16
    2021-10-22 15:35:22.133357: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.777079: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.777723: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:35:34.777983: I tensorflow/core/common_runtime/gpu/] Found device 0 with properties: 
    pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
    coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
    2021-10-22 15:35:34.778194: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.778427: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.778583: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.778749: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.779183: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.825369: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.861433: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.861805: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:35:34.862251: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:35:34.862703: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:35:34.862908: I tensorflow/core/common_runtime/gpu/] Adding visible gpu devices: 0
    2021-10-22 15:37:02.440933: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:02.441206: I tensorflow/core/common_runtime/gpu/] Found device 0 with properties: 
    pciBusID: 0000:00:00.0 name: NVIDIA Tegra X2 computeCapability: 6.2
    coreClock: 1.3GHz coreCount: 2 deviceMemorySize: 7.67GiB deviceMemoryBandwidth: 38.74GiB/s
    2021-10-22 15:37:02.441661: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:02.442112: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:02.442278: I tensorflow/core/common_runtime/gpu/] Adding visible gpu devices: 0
    2021-10-22 15:37:02.442651: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:37:09.339992: I tensorflow/core/common_runtime/gpu/] Device interconnect StreamExecutor with strength 1 edge matrix:
    2021-10-22 15:37:09.340386: I tensorflow/core/common_runtime/gpu/]      0 
    2021-10-22 15:37:09.340484: I tensorflow/core/common_runtime/gpu/] 0:   N 
    2021-10-22 15:37:09.341206: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:09.341823: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:09.342411: I tensorflow/stream_executor/cuda/] ARM64 does not support NUMA - returning NUMA node zero
    2021-10-22 15:37:09.342745: I tensorflow/core/common_runtime/gpu/] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 80 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
    2021-10-22 15:40:55.306220: I tensorflow/compiler/mlir/] None of the MLIR Optimization Passes are enabled (registered 2)
    2021-10-22 15:40:55.546753: I tensorflow/core/platform/profile_utils/] CPU Frequency: 31250000 Hz
    len(batch_images): ['/media/mydisk/YOYOFile/coco_calib/COCO_train2014_000000000009.jpg']
    2021-10-22 15:42:32.819948: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:42:33.464673: I tensorflow/stream_executor/cuda/] Loaded cuDNN version 8201
    2021-10-22 15:42:33.958547: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 24.00MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
    2021-10-22 15:42:33.983722: W tensorflow/core/kernels/] Failed to allocate memory for convolution redzone checking; skipping this check. This is benign and only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
    2021-10-22 15:42:42.844197: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 22.75MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
    2021-10-22 15:42:43.485925: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library
    2021-10-22 15:42:45.070240: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 16.00MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
    2021-10-22 15:42:45.094177: W tensorflow/core/common_runtime/] Allocator (GPU_0_bfc) ran out of memory trying to allocate 16.00MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
    2021-10-22 15:42:55.842108: I tensorflow/core/common_runtime/] 2 Chunks of size 1474560 totalling 2.81MiB
    2021-10-22 15:42:55.842169: I tensorflow/core/common_runtime/] 1 Chunks of size 27442176 totalling 26.17MiB
    2021-10-22 15:42:55.842230: I tensorflow/core/common_runtime/] Sum Total of in-use chunks: 58.26MiB
    2021-10-22 15:42:55.842290: I tensorflow/core/common_runtime/] total_region_allocated_bytes_: 84848640 memory_limit_: 84848640 available bytes: 0 curr_region_allocation_bytes_: 134217728
    2021-10-22 15:42:55.869607: I tensorflow/core/common_runtime/] Stats: 
    Limit:                        84848640
    InUse:                        61095168
    MaxInUse:                     68186112
    NumAllocs:                        1583
    MaxAllocSize:                 27442176
    Reserved:                            0
    PeakReserved:                        0
    LargestFreeBlock:                    0
    2021-10-22 15:42:55.869973: W tensorflow/core/common_runtime/] ****************************************___*******************************xxx_______________________
    2021-10-22 15:42:55.936569: W tensorflow/core/framework/] OP_REQUIRES failed at : Resource exhausted: OOM when allocating tensor with shape[1,96,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    Traceback (most recent call last):
      File "", line 263, in <module>
      File "", line 234, in main
        tf_images, tf_detections = run(tf_batcher, tf_infer, "TensorFlow", args.nms_threshold)
      File "", line 124, in run
        res_detections += inferer.infer(batch, scales, nms_threshold)
      File "", line 77, in infer
        output = self.pred_fn(**input)
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 1711, in __call__
        return self._call_impl(args, kwargs)
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 247, in _call_impl
        args, kwargs, cancellation_manager)
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 1729, in _call_impl
        return self._call_with_flat_signature(args, kwargs, cancellation_manager)
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 1778, in _call_with_flat_signature
        return self._call_flat(args, self.captured_inputs, cancellation_manager)
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 1961, in _call_flat
        ctx, args, cancellation_manager=cancellation_manager))
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 596, in call
      File "/media/mydisk/MyDocuments/PyProjects/automl/efficientdet/venv/lib/python3.6/site-packages/tensorflow/python/eager/", line 60, in quick_execute
        inputs, attrs, num_outputs)
    tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
      (0) Resource exhausted:  OOM when allocating tensor with shape[1,96,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    	 [[node efficientnet-b0/blocks_1/tpu_batch_normalization/FusedBatchNormV3 (defined at ]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
      (1) Resource exhausted:  OOM when allocating tensor with shape[1,96,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
    	 [[node efficientnet-b0/blocks_1/tpu_batch_normalization/FusedBatchNormV3 (defined at ]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
    0 successful operations.
    0 derived errors ignored. [Op:__inference_pruned_42115]
    Function call stack:
    pruned -> pruned
    real	7m57.829s
    user	6m34.100s
    sys	0m14.384s
