Installing TensorRT and accelerating YOLOv4 (tiny)

Environment

CUDA 10.1
cuDNN 7.6.4

Installing TensorRT

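This post uses tkDNN, which is recommended by the YOLOv4 author AlexeyAB and supports accelerated builds of YOLO v1 through YOLO v4, including the tiny variants. tkDNN-TensorRT can speed up YOLOv4 by about 2x at batch=1 and 3-4x at batch=4. This post tests a tiny YOLOv4 model on a TX2, where the speedup at batch=1 is roughly 50%.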
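
To make the speedup figures concrete, here is a small sketch of the arithmetic that relates a speedup factor to per-frame latency and throughput. The 40 ms baseline is a made-up number for illustration, not a measurement from this post, and "roughly 50%" is read here as a 1.5x throughput gain, which is an assumption.

```python
def accelerated_latency_ms(baseline_ms: float, speedup: float) -> float:
    """Per-frame latency after a throughput speedup of `speedup`x."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_ms / speedup


def fps(latency_ms: float) -> float:
    """Frames per second for a given per-frame latency in milliseconds."""
    return 1000.0 / latency_ms


# Hypothetical baseline: 40 ms per frame (25 FPS) without acceleration.
baseline_ms = 40.0
cases = [
    ("tiny YOLOv4 on TX2, batch=1 (assumed 1.5x)", 1.5),
    ("YOLOv4, batch=1 (2x)", 2.0),
]
for label, speedup in cases:
    latency = accelerated_latency_ms(baseline_ms, speedup)
    print(f"{label}: {latency:.1f} ms/frame, {fps(latency):.1f} FPS")
```

Under these assumptions, a 1.5x gain shortens each frame to two thirds of its original time rather than half.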

Downloads: TensorRT 6, TensorRT 8 (the commands below use TensorRT 6.0.1.5)

cd Downloads
tar -xvf TensorRT-6.0.1.5.Ubuntu-18.04.x86_64-gnu.cuda-10.1.cudnn7.6.tar.gz
cd TensorRT-6.0.1.5/python
source activate py37             # install with the matching Python version; here a conda Python 3.7 environment is activated
pip3 install tensorrt-6.0.1.5-cp37-none-linux_x86_64.whl
gedit ~/.bashrc                  # add the export line below to the end of ~/.bashrc, then save
export LD_LIBRARY_PATH="/home/lxj/Downloads/TensorRT-6.0.1.5/targets/x86_64-linux-gnu/lib:$LD_LIBRARY_PATH"
source ~/.bashrc
cd ../uff
pip3 install uff-0.6.5-py2.py3-none-any.whl 
cd ../graphsurgeon
pip3 install graphsurgeon-0.4.1-py2.py3-none-any.whl
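
After editing ~/.bashrc, it can help to confirm that the TensorRT library directory really ended up on LD_LIBRARY_PATH. The following is a minimal sketch; the helper name dirs_with_libnvinfer is made up for this example and is not part of TensorRT.

```python
import os
from pathlib import Path


def dirs_with_libnvinfer(ld_library_path: str) -> list:
    """Return the LD_LIBRARY_PATH entries that contain a libnvinfer.so* file."""
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing ':'
        directory = Path(entry)
        if directory.is_dir() and any(directory.glob("libnvinfer.so*")):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = dirs_with_libnvinfer(os.environ.get("LD_LIBRARY_PATH", ""))
    print(found or "libnvinfer.so not found on LD_LIBRARY_PATH")
```

Run it in a new terminal (or after source ~/.bashrc) so that the updated variable is visible.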

Checking the TensorRT version

source activate py37
python
import tensorrt
tensorrt.__version__
#'6.0.1.5'
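
The same check can be written as a small script that does not crash when the package is missing. This is only a sketch; the function names and the expected prefix "6.0.1" are choices made for this example.

```python
import importlib


def installed_version(module_name: str = "tensorrt"):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def matches_expected(version, expected_prefix: str = "6.0.1") -> bool:
    """True if `version` is a string starting with the expected release prefix."""
    return isinstance(version, str) and version.startswith(expected_prefix)


version = installed_version()
if version is None:
    print("tensorrt is not importable in this Python environment")
else:
    status = "OK" if matches_expected(version) else "unexpected version"
    print("tensorrt", version, status)
```

If the import fails even though the wheel was installed, the usual suspects are the wrong conda environment or a missing LD_LIBRARY_PATH entry.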

Installing yaml-cpp

git clone https://github.com/jbeder/yaml-cpp.git
cd yaml-cpp
mkdir build
cd build
cmake -DBUILD_SHARED_LIBS=ON ..
make
sudo make install

Exporting the darknet weights

You need the trained YOLO model yolo4tiny.weights, the config file yolo4tiny.cfg, and the class names file coco.names.

git clone https://git.hipert.unimore.it/fgatti/darknet.git     # before building, set GPU=0 in the Makefile
cd darknet
make
mkdir layers debug
./darknet export <path-to-cfg-file> <path-to-weights> layers   # replace with your own file paths

The exported output has the following layout:

model
     |---- layers/ (weights of each layer)
     |----------- *.bin
     |---- debug/  (outputs of each layer)
     |----------- *_out.bin
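
A quick way to confirm the export worked is to count the files in the two folders. The sketch below assumes only the layout shown above; summarize_export is a hypothetical helper, and "." assumes the script is run from the directory that holds layers/ and debug/.

```python
from pathlib import Path


def summarize_export(model_dir: str) -> dict:
    """Count exported files in <model_dir>/layers and <model_dir>/debug."""
    root = Path(model_dir)
    return {
        "layers": len(list((root / "layers").glob("*.bin"))),
        "debug": len(list((root / "debug").glob("*_out.bin"))),
    }


if __name__ == "__main__":
    # "." assumes the current directory is the one used for the export.
    counts = summarize_export(".")
    print(counts)
    if counts["layers"] == 0:
        print("no layer weights found; check the paths passed to the export")
```

An empty layers/ folder usually means the cfg or weights path given to the export command was wrong.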

Building tkDNN

First check whether it builds at all:

git clone https://github.com/ceccocats/tkDNN    
cd tkDNN
mkdir build
cd build
cmake .. 
make

If you hit the following error:

install dir:/usr/local
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_nvinfer_LIBRARY
linked by target "tkDNN" in directory /home/zengxiaojia/ClionProjects/tkDNN

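This means CMake cannot find the nvinfer library (libnvinfer.so).
Fix: go to the TensorRT install directory and copy the library into /usr/lib and all header files into /usr/include: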

sudo cp ./lib/libnvinfer.so /usr/lib
sudo cp ./include/* /usr/include/
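
Before rerunning cmake, one rough way to see whether the system can now locate the library is Python's ctypes.util.find_library. This is a sketch only: its search is not identical to CMake's, so treat the result as a hint rather than a guarantee, and linker_name is a name invented for this example.

```python
from ctypes.util import find_library


def linker_name(library: str = "nvinfer"):
    """Return the shared-library name found for `library`, or None."""
    return find_library(library)


name = linker_name()
if name is None:
    print("nvinfer not found by the system library search")
else:
    print("found", name)
```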

Modifying tkDNN

In tkDNN/build, create a new folder, for example yolo4tiny, and move the two folders exported by darknet into it.
Edit tkDNN/tests/darknet/yolo4tiny.cpp: change cfg_path, wgs_path and name_path to your own paths, and comment out the download line.

make
./test_yolo4tiny     
./demo                    # see demo.cpp for how the demo is run

[Figure: test output with tkDNN acceleration]

In short, it got faster and the accuracy dropped, but I did not expect the difference to be this large.

This is the result without acceleration:

[Figure: test output without acceleration]

https://blog.csdn.net/mathlxj/article/details/107810548
https://blog.csdn.net/Lhj0616/article/details/115144420
https://blog.csdn.net/u010881576/article/details/107239170
