Ubuntu 16.04 + CUDA 9.0 + MATLAB + OpenCV 3.3 + Caffe server setup (with the errors encountered and their fixes)
1. Dependency packages required before installation:
Ubuntu dependencies:
sudo apt-get install --assume-yes libopencv-dev build-essential cmake git libgtk2.0-dev pkg-config python-dev python-numpy libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libv4l-dev libtbb-dev libqt4-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev x264 v4l-utils unzip
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt-get install libblas-dev
sudo apt install libatlas-base-dev
OpenCV dependencies:
sudo apt-get install build-essential cmake git
sudo apt-get install ffmpeg libopencv-dev libgtk-3-dev python-numpy python3-numpy libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libv4l-dev libtbb-dev qtbase5-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev x264 v4l-utils unzip
2. Install the NVIDIA graphics driver:
Installer: NVIDIA-Linux-x86_64-384.98.run (matches the Titan Xp card)
Command: sudo sh NVIDIA-Linux-x86_64-384.98.run
Verification: nvidia-smi should print the GPU information
3. Install CUDA 9.0:
Package (deb file): cuda-repo-ubuntu1604-9-0-local-rc_9.0.103-1_amd64.deb (matches the NVIDIA driver)
Command: sudo dpkg -i cuda-repo-ubuntu1604-9-0-local-rc_9.0.103-1_amd64.deb
sudo apt-key add /var/cuda.../7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
Set the environment variables:
gedit ~/.bashrc
Add: export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
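The ${VAR:+:${VAR}} form used in these exports appends a colon and the old value only when the variable is already set. That way an empty variable does not leave a trailing colon, which the shell would treat as the current directory. A minimal sketch of that behaviour, using a hypothetical helper named prepend_dir that exists only for illustration:

```shell
# prepend_dir is a hypothetical helper, not part of CUDA or bash.
# $1: directory to put first, $2: existing colon-separated value (may be empty)
prepend_dir() {
  existing="$2"
  printf '%s\n' "$1${existing:+:${existing}}"
}

prepend_dir /usr/local/cuda-9.0/bin ""          # prints /usr/local/cuda-9.0/bin
prepend_dir /usr/local/cuda-9.0/bin /usr/bin    # prints /usr/local/cuda-9.0/bin:/usr/bin
```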
Verification:
cd /usr/local/cuda-9.0/samples/1_Utilities/deviceQuery
make
sudo ./deviceQuery    (should print the GPU information)
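Besides deviceQuery, it can help to confirm that the nvcc found on PATH is the 9.0 release, since a stale PATH entry may point at another toolkit. The sketch below assumes that nvcc --version prints a line of the form "Cuda compilation tools, release 9.0, V9.0.x"; cuda_release is a hypothetical helper name.

```shell
# Reads `nvcc --version` style text on stdin and prints only the release number.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | cuda_release
else
  # PATH changes in ~/.bashrc only apply to new shells.
  echo "nvcc not found on PATH; open a new terminal or run: . ~/.bashrc"
fi
```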
4. Install cuDNN:
Package: cuDNN v7.0.5
Commands: enter the include directory
sudo cp cudnn.h /usr/local/cuda/include/
Enter the lib64 directory
sudo cp lib* /usr/local/cuda/lib64/
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.7 # remove the existing symlinks
sudo ln -s libcudnn.so.7.0.5 libcudnn.so.7 # create the symlink (the version must match the cuDNN you downloaded; check the libcudnn files in /usr/local/cuda/lib64)
sudo ln -s libcudnn.so.7 libcudnn.so
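Because the version in the link commands has to match the downloaded cuDNN, the chain can also be derived from whatever versioned file is present. The following is a sketch under that assumption; link_cudnn is a hypothetical helper, and the demo runs in a temporary directory rather than in /usr/local/cuda/lib64.

```shell
# link_cudnn DIR: recreate libcudnn.so -> libcudnn.so.<major> -> libcudnn.so.<x.y.z>
link_cudnn() {
  dir="$1"
  real=""
  for f in "$dir"/libcudnn.so.*.*.*; do
    # Keep only a regular file, not a leftover symlink or an unmatched glob.
    if [ -f "$f" ] && [ ! -L "$f" ]; then
      real="$f"
    fi
  done
  if [ -z "$real" ]; then
    echo "no versioned libcudnn found in $dir" >&2
    return 1
  fi
  base=$(basename "$real")        # e.g. libcudnn.so.7.0.5
  major=${base#libcudnn.so.}      # e.g. 7.0.5
  major=${major%%.*}              # e.g. 7
  ln -sf "$base" "$dir/libcudnn.so.$major"
  ln -sf "libcudnn.so.$major" "$dir/libcudnn.so"
}

# Demo on a scratch directory with an empty stand-in library file.
demo=$(mktemp -d)
touch "$demo/libcudnn.so.7.0.5"
link_cudnn "$demo" && ls -l "$demo"
rm -rf "$demo"
```

On the real system the function would be run as root with /usr/local/cuda/lib64 as its argument.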
5. Install OpenCV:
Source: opencv 3.3.0
OpenCV 3.3.0 does not build against CUDA 9.0 out of the box; the fixes are as follows:
Problem: CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_nppi_LIBRARY (ADVANCED)
linked by target "opencv_cudev" in directory D:/Cproject/opencv/opencv/sources/modules/cudev
...
Solution: http://blog.csdn.net/u014613745/article/details/78310916
1) Locate the FindCUDA.cmake file
Find the line
find_cuda_helper_libs(nppi)
and change it to
find_cuda_helper_libs(nppial)
find_cuda_helper_libs(nppicc)
find_cuda_helper_libs(nppicom)
find_cuda_helper_libs(nppidei)
find_cuda_helper_libs(nppif)
find_cuda_helper_libs(nppig)
find_cuda_helper_libs(nppim)
find_cuda_helper_libs(nppist)
find_cuda_helper_libs(nppisu)
find_cuda_helper_libs(nppitc)
2) Find the line
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppi_LIBRARY};${CUDA_npps_LIBRARY}")
and change it to
set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppial_LIBRARY};${CUDA_nppicc_LIBRARY};${CUDA_nppicom_LIBRARY};${CUDA_nppidei_LIBRARY};${CUDA_nppif_LIBRARY};${CUDA_nppig_LIBRARY};${CUDA_nppim_LIBRARY};${CUDA_nppist_LIBRARY};${CUDA_nppisu_LIBRARY};${CUDA_nppitc_LIBRARY};${CUDA_npps_LIBRARY}")
3) Find the line
unset(CUDA_nppi_LIBRARY CACHE)
and change it to
unset(CUDA_nppial_LIBRARY CACHE)
unset(CUDA_nppicc_LIBRARY CACHE)
unset(CUDA_nppicom_LIBRARY CACHE)
unset(CUDA_nppidei_LIBRARY CACHE)
unset(CUDA_nppif_LIBRARY CACHE)
unset(CUDA_nppig_LIBRARY CACHE)
unset(CUDA_nppim_LIBRARY CACHE)
unset(CUDA_nppist_LIBRARY CACHE)
unset(CUDA_nppisu_LIBRARY CACHE)
unset(CUDA_nppitc_LIBRARY CACHE)
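Steps 1) to 3) all expand the single nppi library into the same ten sub-libraries, so the replacement text can be generated instead of typed by hand. This is only a sketch for producing the lines to paste into FindCUDA.cmake; the function names are hypothetical and nothing here edits the file.

```shell
# The ten libraries that replace nppi, as listed in steps 1) to 3).
NPPI_LIBS="nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc"

gen_find_lines() {
  for lib in $NPPI_LIBS; do
    printf 'find_cuda_helper_libs(%s)\n' "$lib"
  done
}

gen_unset_lines() {
  for lib in $NPPI_LIBS; do
    printf 'unset(CUDA_%s_LIBRARY CACHE)\n' "$lib"
  done
}

gen_set_line() {
  list='${CUDA_nppc_LIBRARY}'
  for lib in $NPPI_LIBS; do
    # \$ keeps the dollar sign literal so CMake, not the shell, expands it.
    list="${list};\${CUDA_${lib}_LIBRARY}"
  done
  printf 'set(CUDA_npp_LIBRARY "%s;${CUDA_npps_LIBRARY}")\n' "$list"
}

gen_find_lines
gen_set_line
gen_unset_lines
```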
4) Locate the file OpenCVDetectCUDA.cmake
and modify the following lines
...
set(__cuda_arch_ptx "")
if(CUDA_GENERATION STREQUAL "Fermi")
set(__cuda_arch_bin "2.0")
elseif(CUDA_GENERATION STREQUAL "Kepler")
set(__cuda_arch_bin "3.0 3.5 3.7")
...
to
...
set(__cuda_arch_ptx "")
if(CUDA_GENERATION STREQUAL "Kepler")
set(__cuda_arch_bin "3.0 3.5 3.7")
elseif(CUDA_GENERATION STREQUAL "Maxwell")
set(__cuda_arch_bin "5.0 5.2")
...
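Step 4) drops the Fermi branch because CUDA 9 no longer supports compute capability 2.x. The relationship between the CUDA_GENERATION name and the architecture list can be sketched as a lookup. The Kepler and Maxwell values come from the excerpt above; the Pascal entry is an assumption about the part of the file elided by "...", and arch_bin_for_generation is a hypothetical name.

```shell
# Prints the __cuda_arch_bin value that corresponds to a CUDA_GENERATION name.
arch_bin_for_generation() {
  case "$1" in
    Kepler)  echo "3.0 3.5 3.7" ;;
    Maxwell) echo "5.0 5.2" ;;
    Pascal)  echo "6.0 6.1" ;;   # assumption: not shown in the excerpt
    *)       echo "unknown generation: $1" >&2; return 1 ;;
  esac
}

arch_bin_for_generation Kepler
arch_bin_for_generation Maxwell
```

The Titan Xp has compute capability 6.1, so it may be worth checking that the generation passed to cmake covers that value, or passing -D CUDA_ARCH_BIN=6.1 explicitly.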
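Problem: the ippicv archive fails to download, so the build cannot continue
Solution: download ippicv_2017u2_lnx_intel64_20170418.tgz from the opencv 3rdparty repository on GitHub; look up the hash in /3rdparty/ippicv/ippicv.cmake and rename the downloaded .tgz file to "<hash>-ippicv_linux_20170418.tgz"; then save the renamed file under opencv3.3.0/.cache/ippicv.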
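Before renaming the archive it may be worth confirming that the download is intact. The sketch below assumes the hash listed in ippicv.cmake is an MD5 digest (it is 32 hexadecimal characters long); matches_md5 is a hypothetical helper, and the demo uses an empty temporary file instead of the real archive.

```shell
# matches_md5 FILE EXPECTED: succeeds when FILE's MD5 digest equals EXPECTED.
matches_md5() {
  actual=$(md5sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Demo: the MD5 digest of an empty file is d41d8cd98f00b204e9800998ecf8427e.
sample=$(mktemp)
if matches_md5 "$sample" d41d8cd98f00b204e9800998ecf8427e; then
  echo "checksum OK"
else
  echo "checksum mismatch"
fi
rm -f "$sample"
```

For the real file, pass the downloaded .tgz and the hash copied from ippicv.cmake.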
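Problem: CUDA 9 ships the half-precision float definitions in a separate header (cuda_fp16.h), which OpenCV must also include.
5) Add the header cuda_fp16.h to opencv\modules\cudev\include\opencv2\cudev\common.hpp, i.e. add the following line to common.hpp:
#include <cuda_fp16.h>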
Then enter the opencv directory and build:
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D CUDA_GENERATION=Kepler ..
sudo make
sudo make install
Test the installation.
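One quick check, as a minimal sketch: OpenCV 3.x installs a pkg-config file named opencv, so pkg-config can report the installed version. The helper name opencv_version is hypothetical, and it assumes /usr/local/lib/pkgconfig is on pkg-config's search path.

```shell
# Prints the installed OpenCV version, or "not found" if pkg-config cannot see it.
opencv_version() {
  if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists opencv; then
    pkg-config --modversion opencv
  else
    echo "not found"
  fi
}

opencv_version
```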
6. Install MATLAB:
Installer: the R2016a ISO image
mkdir ~/matlab_iso
sudo mount -o loop R2016a_glnxa64.iso ~/matlab_iso
cd ~/matlab_iso
sudo ./install
Choose activation without an Internet connection and enter the File Installation Key for your license.
Install path: /usr/local/MATLAB/R2016a
After installation, copy libmwservices.so into /usr/local/MATLAB/R2016a/bin/glnxa64:
sudo cp libmwservices.so /usr/local/MATLAB/R2016a/bin/glnxa64/libmwservices.so
7. Multi-GPU programming: install NCCL
git clone https://github.com/NVIDIA/nccl.git
cd nccl
sudo make install -j4
8. Install Caffe
Source: caffe-master downloaded from GitHub
Commands:
cd caffe-master
sudo cp Makefile.config.example Makefile.config
sudo gedit Makefile.config
(1) Edit Makefile.config
To use cuDNN, change # USE_CUDNN := 1 to: USE_CUDNN := 1
If your OpenCV is version 3, change # OPENCV_VERSION := 3 to: OPENCV_VERSION := 3
To write layers in Python, change # WITH_PYTHON_LAYER := 1 to: WITH_PYTHON_LAYER := 1
Uncomment the line USE_NCCL := 1. This enables NCCL, which Caffe needs to run on multiple GPUs.
Below the line # Whatever else you find you need goes here., change
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
to:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
This is needed because Ubuntu 16.04 moved the header and library locations, in particular those of hdf5.
Delete the compute_20 and compute_21 entries from Makefile.config for compatibility with CUDA >= 9.0
To use the MATLAB interface, set MATLAB_DIR to your own MATLAB install path, changing
MATLAB_DIR := /usr/local
to
MATLAB_DIR := /usr/local/MATLAB/R2016a
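The on/off switches in Makefile.config are all commented lines of the form "# NAME := value", so they can also be enabled with sed rather than in an editor. A minimal sketch, assuming GNU sed (for -i) and that exact line layout; uncomment_option is a hypothetical helper, and the demo works on a temporary file instead of the real Makefile.config.

```shell
# uncomment_option FILE NAME: turn "# NAME := value" into "NAME := value".
uncomment_option() {
  sed -i "s/^# *\($2 := .*\)\$/\1/" "$1"
}

# Demo on a scratch file that mimics the relevant lines.
cfg=$(mktemp)
printf '# USE_CUDNN := 1\n# OPENCV_VERSION := 3\n# USE_NCCL := 1\n# USE_LEVELDB := 0\n' > "$cfg"
for opt in USE_CUDNN OPENCV_VERSION USE_NCCL; do
  uncomment_option "$cfg" "$opt"
done
cat "$cfg"    # the USE_LEVELDB line stays commented
rm -f "$cfg"
```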
(2) Open the Makefile
Replace
NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
with
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
Finally:
sudo make all -j8
sudo make test
sudo make runtest
sudo make pycaffe
sudo make matcaffe
Problem: NVCC src/caffe/test/test_im2col_kernel.cu
nvcc fatal: Unsupported gpu architecture 'compute_20'
Solution: delete the compute_20 and compute_21 entries from Makefile.config for compatibility with CUDA >= 9.0
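In the stock Makefile.config the first compute_20 entry shares a line with "CUDA_ARCH :=", so deleting whole lines would also remove the variable itself. The sketch below strips only the two -gencode fragments and then drops any line left holding nothing but a continuation backslash. It assumes GNU sed and the default layout of CUDA_ARCH; strip_fermi_arch is a hypothetical helper, demonstrated on a temporary file.

```shell
# strip_fermi_arch FILE: remove the sm_20 / sm_21 -gencode entries in place.
strip_fermi_arch() {
  sed -i -e 's/-gencode arch=compute_20,code=sm_2[01][[:space:]]*//' \
         -e '/^[[:space:]]*\\$/d' "$1"
}

# Demo on a scratch file that mimics the start of the CUDA_ARCH block.
cfg=$(mktemp)
printf 'CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \\\n' > "$cfg"
printf '\t\t-gencode arch=compute_20,code=sm_21 \\\n' >> "$cfg"
printf '\t\t-gencode arch=compute_30,code=sm_30\n' >> "$cfg"
strip_fermi_arch "$cfg"
cat "$cfg"
rm -f "$cfg"
```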
Note: before installing Caffe, confirm that /usr/local/cuda-9.0 contains a bin directory
9. Install the SSH remote service
sudo apt-get install openssh-server
Open a terminal and run "sudo ps -e |grep ssh"; if sshd is listed, the ssh service is already running. If it is not, run "sudo service ssh start" to start it.
Open a terminal and run "sudo gedit /etc/ssh/sshd_config"; comment out the line "PermitRootLogin without-password" by prefixing it with "#", add the line "PermitRootLogin yes", and save.
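The same edit to sshd_config can be scripted. This is a sketch assuming GNU sed and that the file contains the exact line "PermitRootLogin without-password"; allow_root_login is a hypothetical helper, shown here on a temporary file rather than on /etc/ssh/sshd_config.

```shell
# allow_root_login FILE: comment out the old setting and append the new one.
allow_root_login() {
  sed -i 's/^PermitRootLogin without-password/#&/' "$1"
  printf 'PermitRootLogin yes\n' >> "$1"
}

# Demo on a scratch file.
conf=$(mktemp)
printf 'Port 22\nPermitRootLogin without-password\n' > "$conf"
allow_root_login "$conf"
cat "$conf"
rm -f "$conf"
```

sshd reads its configuration at startup, so after changing the real file the service has to be restarted, for example with sudo service ssh restart.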