Working environment
Graphics card: GPU
OS: Ubuntu 16.04.5 LTS
CUDA: 10.0
Python: 3.x
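1. Set up conda
Official download page: https://www.anaconda.com/distribution/#download-section
Download the installer that matches your system, then run it.
    cd init
    # Download the Linux x86_64 installer
    sudo wget https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
    bash Anaconda3-2019.03-Linux-x86_64.sh
Follow the prompts and choose an installation directory; the default is ~/anaconda3/.
Note: initialization
1. If you keep the default answer (no) at the initialization prompt, the conda command is not available after installation and conda has to be initialized manually, as the installer output explains: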
installation finished.
Do you wish the installer to initialize Anaconda3
by running conda init? [yes|no]
[no] >>>
You have chosen to not have conda modify your shell scripts at all.
To activate conda's base environment in your current shell session:
eval "$(/home/cortex/anaconda3/bin/conda shell.YOUR_SHELL_NAME hook)"
To install conda's shell functions for easier access, first activate, then:
conda init
If you'd prefer that conda's base environment not be activated on startup,
set the auto_activate_base parameter to false:
conda config --set auto_activate_base false
Thank you for installing Anaconda3!
===========================================================================
Anaconda and JetBrains are working together to bring you Anaconda-powered
environments tightly integrated in the PyCharm IDE.
PyCharm for Anaconda is available at:
https://www.anaconda.com/pycharm
2. If you answer yes, the installer modifies ~/.bashrc and sets up the conda command:
installation finished.
Do you wish the installer to initialize Anaconda3
by running conda init? [yes|no]
[no] >>> yes
WARNING: The conda.compat module is deprecated and will be removed in a future release.
no change /home/cortex/anaconda3/condabin/conda
no change /home/cortex/anaconda3/bin/conda
no change /home/cortex/anaconda3/bin/conda-env
no change /home/cortex/anaconda3/bin/activate
no change /home/cortex/anaconda3/bin/deactivate
no change /home/cortex/anaconda3/etc/profile.d/conda.sh
no change /home/cortex/anaconda3/etc/fish/conf.d/conda.fish
no change /home/cortex/anaconda3/shell/condabin/Conda.psm1
no change /home/cortex/anaconda3/shell/condabin/conda-hook.ps1
no change /home/cortex/anaconda3/lib/python3.7/site-packages/xonsh/conda.xsh
no change /home/cortex/anaconda3/etc/profile.d/conda.csh
modified /home/cortex/.bashrc
==> For changes to take effect, close and re-open your current shell. <==
If you'd prefer that conda's base environment not be activated on startup,
set the auto_activate_base parameter to false:
conda config --set auto_activate_base false
Thank you for installing Anaconda3!
===========================================================================
Anaconda and JetBrains are working together to bring you Anaconda-powered
environments tightly integrated in the PyCharm IDE.
PyCharm for Anaconda is available at:
https://www.anaconda.com/pycharm
Leave the current conda environment:
    conda deactivate
2. Create and enter the conda py3.6 environment
    conda create -n py36 python=3.6
    conda activate py36
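To confirm that the new environment is the one actually in use, a small check along these lines may help. It is only a sketch: it assumes that `conda activate` exports the `CONDA_DEFAULT_ENV` variable, and the function name `active_env` is made up for illustration.

```python
import os
import sys


def active_env(environ=None):
    """Return the name of the active conda environment, or None if unset."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    # Inside the environment created above, this should report py36 and 3.6.
    print("conda env:", active_env())
    print("python   :", "%d.%d" % sys.version_info[:2])
```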
3. Install the required packages
# Point pip at the Tsinghua mirror
    mkdir -p ~/.pip
    touch ~/.pip/pip.conf
# Put the following in pip.conf
    [global]
    index-url = https://pypi.tuna.tsinghua.edu.cn/simple
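If the mirror has to be configured on several machines, the same file can be generated from a script. The following is a minimal sketch using the standard-library `configparser`; the helper name `write_pip_conf` is hypothetical, and the function overwrites any file already at the given path.

```python
import configparser
import os


def write_pip_conf(path, index_url):
    """Write a minimal pip.conf containing a [global] index-url entry."""
    directory = os.path.dirname(path)
    if directory:
        # Create ~/.pip (or whichever directory was given) if it is missing.
        os.makedirs(directory, exist_ok=True)
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    with open(path, "w") as handle:
        config.write(handle)
```

Calling `write_pip_conf(os.path.expanduser("~/.pip/pip.conf"), "https://pypi.tuna.tsinghua.edu.cn/simple")` should produce the same content as the manual steps.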
Install the packages:
    pip install numpy==1.16.2
    pip install opencv-python==4.1.0.25
    pip install keras==2.1.4
    pip install tensorflow-gpu==1.13.1
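A quick way to see what actually got installed is to import each package and print its version. This is a hedged sketch rather than an official check: `installed_version` is a hypothetical helper, and it treats any ImportError (for example from a missing shared library) as "not usable".

```python
import importlib

# Import names of the packages installed above
# (opencv-python is imported as cv2).
MODULES = ["numpy", "cv2", "keras", "tensorflow"]


def installed_version(module_name):
    """Return module.__version__, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in MODULES:
        print(name, installed_version(name))
```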
4. Install NCCL 2
Download page: https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html
Download the NCCL 2 package that matches your OS and CUDA version.
    sudo dpkg -i nccl-repo-ubuntu1604-2.4.7-ga-cuda10.0_1-1_amd64.deb
    # Run the apt-key command exactly as printed by the dpkg step
    sudo apt-key add /var/nccl-repo-2.4.7-ga-cuda10.0/7fa2af80.pub
    sudo apt update
    sudo apt install libnccl2=2.4.7-1+cuda10.0 libnccl-dev=2.4.7-1+cuda10.0
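To check that the NCCL library is visible to the dynamic linker after installation, one option is to look for it in the `ldconfig -p` cache listing. The sketch below assumes the usual entry format of that listing (`name (arch) => path`); the helper names are hypothetical.

```python
import subprocess


def libraries_in_cache(ldconfig_output, prefix):
    """Return library names from `ldconfig -p` output that start with prefix."""
    names = []
    for line in ldconfig_output.splitlines():
        fields = line.split()
        # Entry lines look like:
        #   libnccl.so.2 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnccl.so.2
        if fields and "=>" in fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return names


def read_ldconfig():
    """Return the output of `ldconfig -p`, or an empty string if it is unavailable."""
    try:
        result = subprocess.run(
            ["ldconfig", "-p"], stdout=subprocess.PIPE, universal_newlines=True
        )
    except FileNotFoundError:
        return ""
    return result.stdout


if __name__ == "__main__":
    # An empty list suggests the library is not registered with the linker.
    print(libraries_in_cache(read_ldconfig(), "libnccl"))
```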
5. Install Open MPI
Build instructions: https://www.open-mpi.org/faq/?category=building#easy-build
    sudo wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
    gunzip -c openmpi-4.0.1.tar.gz | tar xf -
    cd openmpi-4.0.1/
    sudo ./configure --prefix=/usr/local
    sudo make all install
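After the build finishes, it is worth confirming that `mpirun` is found on the PATH and reports the expected version. A minimal sketch, with `tool_version` as a hypothetical helper name:

```python
import shutil
import subprocess


def tool_version(name):
    """Return the first line of `<name> --version`, or None if name is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    # With the build above, the line should mention Open MPI and its version.
    print("mpirun:", tool_version("mpirun"))
```

A result of None would suggest that /usr/local/bin is not on the PATH of the current shell.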
6. Install Horovod
Documentation: https://github.com/horovod/horovod/blob/master/docs/gpus.md
    HOROVOD_GPU_ALLREDUCE=NCCL pip install --no-cache-dir horovod
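Once the install completes, a short smoke test can show whether Horovod imports and initializes. The sketch below uses the `horovod.tensorflow` module with its `init`, `rank`, `local_rank`, and `size` functions; the wrapper `horovod_summary` and the file name mentioned afterwards are assumptions made for illustration.

```python
def horovod_summary():
    """Return rank/size information from Horovod, or None if it cannot be imported."""
    try:
        import horovod.tensorflow as hvd
    except ImportError:
        return None
    hvd.init()
    return {
        "rank": hvd.rank(),              # global index of this process
        "local_rank": hvd.local_rank(),  # index of this process on its machine
        "size": hvd.size(),              # total number of processes
    }


if __name__ == "__main__":
    print(horovod_summary())
```

Saved as, say, check_hvd.py and launched with `mpirun -np 2 python check_hvd.py`, each process should print its own rank with a size of 2, assuming two GPUs are available on the machine.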
Problems that may come up during installation:
1. ImportError: libcudnn.so.7: cannot open shared object file: No such file or directory
Download the cuDNN packages that match your versions from https://developer.nvidia.com/rdp/cudnn-download and install them:
    sudo dpkg -i libcudnn7_7.6.0.64-1+cuda10.0_amd64.deb
    sudo dpkg -i libcudnn7-dev_7.6.0.64-1+cuda10.0_amd64.deb
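The ImportError above comes from the dynamic loader failing to open the library, so the fix can be verified by asking the loader directly. A minimal sketch with `ctypes`; the helper name `can_load` is hypothetical.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # False here means the error from the import is likely to persist.
    print("libcudnn.so.7 loadable:", can_load("libcudnn.so.7"))
```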