Using Torch and TensorFlow: Installing Torch and TensorFlow on Ubuntu 14.04 (x64) + CUDA 8.0

2022-08-06 16:22:09

System configuration:

Ubuntu 14.04 (x64)

CUDA 8.0

cudnn-8.0-linux-x64-v5.1.tgz (required by TensorFlow)

Anaconda

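1. Installing Torch

Torch is a very good deep learning framework with a large user base. I had been running my experiments in Caffe, but a recent experiment needs to run under Torch, so I took the opportunity to install it.

The official Torch documentation is already very detailed, so the installation can simply follow it.

First, clone Torch from GitHub into the ~/torch folder: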

git clone https://github.com/torch/distro.git ~/torch --recursive

cd ~/torch;

bash install-deps;

./install.sh

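Permission problems may come up during installation; if a command fails for that reason, prefix it with sudo to run it with administrator privileges.

During installation, the Torch installer automatically writes the required PATH and other environment variables into the shell configuration file. All that remains is to reload that file so the new variables take effect.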

# On Linux with bash

source ~/.bashrc

# On Linux with zsh

source ~/.zshrc

# On OSX or in Linux with none of the above.

source ~/.profile
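Which of the three files to source depends on the login shell. The following is a minimal sketch of that choice in Python; the helper name rc_file_for_shell is hypothetical, and it assumes the shell can be read from the SHELL environment variable.

```python
import os


def rc_file_for_shell(shell_path):
    """Return the startup file to source for a given login shell path."""
    shell = os.path.basename(shell_path or "")
    if shell == "bash":
        return "~/.bashrc"
    if shell == "zsh":
        return "~/.zshrc"
    # Any other shell falls back to ~/.profile, as in the list above.
    return "~/.profile"


if __name__ == "__main__":
    # SHELL usually holds the path of the user's login shell.
    print("source " + rc_file_for_shell(os.environ.get("SHELL", "")))
```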

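If no problems occurred, Torch has been installed successfully.

Torch is used much like Python: typing th on the command line starts its interactive prompt and prints the startup banner.

At this point, Torch is installed and working.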

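2. Installing TensorFlow

The installation guide on the official TensorFlow site is already thorough; refer to it directly.

Because Torch and Caffe are already installed on this system, I recommend installing TensorFlow with Anaconda to keep the environment variables from getting tangled; it is quick and easy to manage.

See also the tutorial on installing TensorFlow with Anaconda.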

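TensorFlow-GPU 1.2.1

1. Install CUDA

First download CUDA 8.0, then change into the download directory and run the installer for the package you downloaded.

After installation, configure the environment variables by adding the following to the .bashrc file in your home directory: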

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

export CUDA_HOME=/usr/local/cuda
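After reloading .bashrc, it can be useful to confirm that the variables are actually visible to new processes. The sketch below is one possible check; the function name check_cuda_env is hypothetical, and /usr/local/cuda is assumed to be the CUDA install location used above.

```python
import os


def check_cuda_env(env, cuda_root="/usr/local/cuda"):
    """Return a list of problems found in the CUDA-related variables of env."""
    problems = []
    lib_dir = cuda_root + "/lib64"
    # LD_LIBRARY_PATH is a ':'-separated list; ignore empty entries and trailing slashes.
    entries = [p.rstrip("/") for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if lib_dir not in entries:
        problems.append("LD_LIBRARY_PATH does not contain " + lib_dir)
    if not env.get("CUDA_HOME", ""):
        problems.append("CUDA_HOME is not set")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(dict(os.environ)):
        print("warning: " + problem)
```

An empty result means both variables look as expected; it does not prove that the CUDA libraries themselves are installed.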

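2. Install cuDNN

CUDA 8.0 supports cuDNN v5.0 and v5.1. However, when I tested the MNIST example after installing TensorFlow, it reported that the code was built against cuDNN v5.1, so I switched to v5.1.
Download cuDNN v5.1, change into the download directory, and run the following commands: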

tar xvzf cudnn-8.0-linux-x64-v5.1.tgz

sudo cp cuda/include/cudnn.h /usr/local/cuda/include

sudo cp cuda/lib64/libcudnn.so* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn.so*

The second and third commands above copy the cuDNN header and library files into the include and lib64 directories under the CUDA path.
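Since the cuDNN version matters here (v5.0 versus v5.1), one way to confirm which version was copied is to read the version macros from the installed header. This is a rough sketch; it assumes that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the v5 headers do, and the function name parse_cudnn_version is hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Path taken from the cp command above.
    path = "/usr/local/cuda/include/cudnn.h"
    try:
        with open(path) as handle:
            print(parse_cudnn_version(handle.read()))
    except IOError:
        print("cudnn.h not found at " + path)
```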

3. Install TensorFlow with Anaconda

First create a new conda environment named tensorflow:

conda create -n tensorflow python=2.7

Then activate the environment and install TensorFlow inside it:

source activate tensorflow

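The environment is managed by conda, but TensorFlow itself is installed into it with pip:

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp27-none-linux_x86_64.whl

pip install --ignore-installed --upgrade $TF_BINARY_URL

This completes the installation of the GPU version of TensorFlow. When you are finished using it, deactivate the tensorflow environment: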

source deactivate
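The wheel file name in TF_BINARY_URL encodes which interpreter and platform it targets: cp27 stands for CPython 2.7, which is why the conda environment was created with Python 2.7. The sketch below splits a wheel name into its parts following the standard wheel naming scheme (PEP 427); the helper name parse_wheel_name is hypothetical.

```python
def parse_wheel_name(url):
    """Split a wheel URL or file name into its PEP 427 components."""
    filename = url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # Five parts normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


if __name__ == "__main__":
    url = ("https://storage.googleapis.com/tensorflow/linux/gpu/"
           "tensorflow_gpu-1.2.1-cp27-none-linux_x86_64.whl")
    info = parse_wheel_name(url)
    print("%s %s for %s on %s" % (info["distribution"], info["version"],
                                  info["python_tag"], info["platform_tag"]))
```

If the python tag does not match the interpreter in the active environment, pip will refuse the wheel as unsupported on that platform.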

To quickly check that TensorFlow was installed correctly, run the following with the tensorflow environment active:

python

...

>>> import tensorflow as tf

>>> hello = tf.constant('Hello, TensorFlow!')

>>> sess = tf.Session()  # this step prints the machine's GPU information

>>> print(sess.run(hello))

Hello, TensorFlow!

>>> a = tf.constant(10)  # illustrative value

>>> b = tf.constant(32)  # illustrative value

>>> print(sess.run(a + b))
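If import tensorflow fails in the session above, a common cause is that the wrong interpreter is running, for example the system Python instead of the one in the conda environment. The following sketch prints which interpreter is active and whether it can import the module; the helper name module_available is hypothetical.

```python
import importlib
import sys


def module_available(name):
    """Return True if the named module can be imported in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # sys.executable shows which python binary is actually running.
    print("interpreter: " + sys.executable)
    print("tensorflow importable: " + str(module_available("tensorflow")))
```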

Running a simple MNIST experiment.

The TensorFlow installed through Anaconda does not ship with the models directory, so it has to be cloned from GitHub:

cd /home/startag/anaconda/envs/tensorflow

git clone https://github.com/tensorflow/models.git

Then run the MNIST program to test TensorFlow:

cd /home/startag/anaconda/envs/tensorflow/models/tutorials/image/mnist

python convolutional.py

Output of the run:

Extracting data/train-images-idx3-ubyte.gz

Extracting data/train-labels-idx1-ubyte.gz

Extracting data/t10k-images-idx3-ubyte.gz

Extracting data/t10k-labels-idx1-ubyte.gz

2017-05-27 11:02:25.851275: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

2017-05-27 11:02:25.851296: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

2017-05-27 11:02:25.851301: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

2017-05-27 11:02:25.851304: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.

2017-05-27 11:02:25.851306: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

2017-05-27 11:02:25.945195: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero

2017-05-27 11:02:25.945405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:

name: GeForce GTX TITAN X

major: 5 minor: 2 memoryClockRate (GHz) 1.076

pciBusID 0000:01:00.0

Total memory: 11.91GiB

Free memory: 429.00MiB

2017-05-27 11:02:25.945418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0

2017-05-27 11:02:25.945421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y

2017-05-27 11:02:25.945427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0)

Initialized!

Step 0 (epoch 0.00), 13.7 ms

Minibatch loss: 8.334, learning rate: 0.010000

Minibatch error: 85.9%

Validation error: 84.6%
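The device block in this log reports both total and free GPU memory. In the run above only about 3.5% of the memory was free (429.00MiB of 11.91GiB), which may mean another process was holding GPU memory at the time. A small sketch for pulling these numbers out of a saved log follows; the function names are hypothetical, and it assumes the "Total memory:" and "Free memory:" line format shown above.

```python
import re

# Conversion factors to MiB for the units that appear in the log.
UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def memory_in_mib(log_text, label):
    """Find a line such as 'Free memory: 429.00MiB' and return the value in MiB."""
    match = re.search(re.escape(label) + r" memory:\s*([\d.]+)(MiB|GiB)", log_text)
    if match is None:
        return None
    return float(match.group(1)) * UNIT_TO_MIB[match.group(2)]


def free_fraction(log_text):
    """Return free/total GPU memory as a fraction, or None if either value is missing."""
    total = memory_in_mib(log_text, "Total")
    free = memory_in_mib(log_text, "Free")
    if not total or free is None:
        return None
    return free / total


if __name__ == "__main__":
    sample = "Total memory: 11.91GiB\nFree memory: 429.00MiB\n"
    print("free fraction: %.3f" % free_fraction(sample))
```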

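Reference: Installing TensorFlow with Anaconda and configuring the GPU on Ubuntu 16.04 LTS

With this, upgrading Caffe and installing Torch and TensorFlow is finally done, after three days of work.

I ran into countless pitfalls along the way, but the required environment is now in place. I hope the upcoming experiments go smoothly and produce good results.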

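3. Configuring Sublime for the TensorFlow environment

With TensorFlow installed, it can already be used directly from the terminal. When writing code, however, the editor also needs to recognize TensorFlow's functions for completion, so Sublime has to be configured to recognize the TensorFlow environment and its settings.

Step 1:

Install the Anaconda plugin through Sublime's Package Control (Ctrl+Shift+P);

Then open Preferences -> Package Settings, find Anaconda, choose Settings - User, and put the following configuration into the settings file: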

{

    "python_interpreter": "/home/startag/anaconda/bin/python",

    "suppress_word_completions":true,

    "suppress_explicit_completions":true,

    "complete_parameters":true,

    "anaconda_linting": false

}

After that, import tensorflow works normally.
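The python_interpreter entry decides which Python the plugin uses for completion. If TensorFlow lives in a separate conda environment, the interpreter inside that environment is usually found at envs/<name>/bin/python under the Anaconda root. The sketch below generates the same settings for a chosen interpreter; build_anaconda_settings is a hypothetical helper, and the environment path in the example is an assumption based on the paths used earlier.

```python
import json


def build_anaconda_settings(interpreter):
    """Return the Sublime Anaconda user settings as a JSON string."""
    settings = {
        "python_interpreter": interpreter,
        "suppress_word_completions": True,
        "suppress_explicit_completions": True,
        "complete_parameters": True,
        "anaconda_linting": False,
    }
    return json.dumps(settings, indent=4, sort_keys=True)


if __name__ == "__main__":
    # Assumed location of the interpreter inside the "tensorflow" conda environment.
    print(build_anaconda_settings("/home/startag/anaconda/envs/tensorflow/bin/python"))
```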

4. Troubleshooting

1. The following error is reported when using TensorFlow:

from google.protobuf import symbol_database as _symbol_database

ImportError: cannot import name symbol_database

This happens because TensorFlow was installed through Anaconda: conda created a separate environment for it, and no suitable protobuf could be found in that environment.

Solution:

sudo apt-get remove python-protobuf

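This removes the protobuf installed via apt-get so that it cannot interfere with the protobuf that TensorFlow uses.

If this does not help, upgrade protobuf with the following command: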

sudo pip install --upgrade protobuf

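Note that this command installs protobuf into the system's own Python 2.7.

To upgrade the protobuf inside Anaconda, first activate the tensorflow environment, that is, switch into the Anaconda environment, and install from there:

source activate tensorflow  # activate the tensorflow environment

Then upgrade protobuf with pip: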

pip install --ignore-installed --upgrade protobuf
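Because pip installs into whichever Python it belongs to, it helps to confirm that the active interpreter really is the one inside the conda environment before upgrading. A minimal sketch of that check, assuming the usual <anaconda root>/envs/<name> layout; the helper name is_env_prefix is hypothetical.

```python
import os
import sys


def is_env_prefix(prefix, env_name):
    """Return True if prefix looks like <anaconda root>/envs/<env_name>."""
    head, tail = os.path.split(prefix.rstrip("/"))
    return tail == env_name and os.path.basename(head) == "envs"


if __name__ == "__main__":
    # sys.prefix is the installation root of the running interpreter.
    print("sys.prefix: " + sys.prefix)
    print("inside 'tensorflow' env: " + str(is_env_prefix(sys.prefix, "tensorflow")))
```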

Tips:

When using Anaconda, pip can be used to inspect installed packages and confirm that the installed version is the right one:

$ pip show protobuf

Name: protobuf

Version: 3.3.

Summary: Protocol Buffers

Home-page: https://developers.google.com/protocol-buffers/

Author: protobuf@googlegroups.com

Author-email: protobuf@googlegroups.com

License: -Clause BSD License

Location: /home/startag/anaconda/lib/python2./site-packages

Requires: six, setuptools
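The Location field is the most useful part of this output, because it shows which Python installation the package belongs to. If the check needs to be scripted, the "Key: value" lines can be parsed as in the sketch below. The function names are hypothetical, and the sample text is made up for illustration rather than taken from a real package.

```python
def parse_pip_show(text):
    """Turn 'Key: value' lines from `pip show` into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def version_tuple(version):
    """Convert a dotted version string into a tuple of ints, skipping non-numeric parts."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


if __name__ == "__main__":
    # Made-up sample in the same format as the output above.
    sample = "Name: example-package\nVersion: 1.4.2\nHome-page: https://example.org/\n"
    info = parse_pip_show(sample)
    print(info["Name"] + " " + str(version_tuple(info["Version"])))
```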

For more on using Anaconda, see: Guide to configuring multiple environments and multiple Python versions with Anaconda