Preface
Installing the pytorch_geometric library from source on Ubuntu with an RTX 3080.
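1. What is pytorch_geometric?
pytorch_geometric is a PyTorch extension library for graph neural networks. It bundles common graph neural network architectures along with classic methods. See the official documentation for details.
2. Installation steps
2.1 Installing the prebuilt version
2.1.1 Check the PyTorch version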
$ python -c "import torch; print(torch.__version__)"
>>> 1.6.0
2.1.2 Check the CUDA version PyTorch was built with
$ python -c "import torch; print(torch.version.cuda)"
>>> 10.2
2.1.3 Install the following packages
pip install torch-scatter==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-cluster==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-spline-conv==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-geometric
Here ${CUDA} and ${TORCH} must be replaced with the versions on your own machine. ${CUDA} is one of cpu, cu92, cu101 or cu102; the system CUDA version can be checked with:
cat /usr/local/cuda/version.txt
${TORCH} is your PyTorch version (1.4.0, 1.5.0 or 1.6.0). For example, for PyTorch 1.5.0/1.5.1 and CUDA 10.2:
pip install torch-scatter==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-sparse==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-cluster==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-spline-conv==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-geometric
For PyTorch 1.6.0 and CUDA 10.1:
pip install torch-scatter==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-sparse==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-cluster==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-spline-conv==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-geometric
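The substitution of ${CUDA} and ${TORCH} described above can be sketched in Python. This is only an illustration: the helper names wheel_tags and pip_commands are made up for this example, and the rule of rounding the patch version down to .0 is inferred from the 1.5.0/1.5.1 case above.

```python
def wheel_tags(torch_version, cuda_version):
    """Map e.g. ('1.5.1', '10.2') to ('1.5.0', 'cu102').

    cuda_version is what torch.version.cuda reports; None means a CPU-only build.
    """
    base = torch_version.split("+")[0]          # drop a local suffix such as '+cu101'
    major, minor = base.split(".")[:2]
    torch_tag = f"{major}.{minor}.0"            # assumption: wheels are indexed by x.y.0
    cuda_tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return torch_tag, cuda_tag


def pip_commands(torch_version, cuda_version):
    """Build the five pip commands listed in section 2.1.3."""
    torch_tag, cuda_tag = wheel_tags(torch_version, cuda_version)
    url = f"https://pytorch-geometric.com/whl/torch-{torch_tag}.html"
    packages = ["torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"]
    commands = [f"pip install {name}==latest+{cuda_tag} -f {url}" for name in packages]
    commands.append("pip install torch-geometric")
    return commands


if __name__ == "__main__":
    try:
        import torch
        for command in pip_commands(torch.__version__, torch.version.cuda):
            print(command)
    except ImportError:
        print("PyTorch is not installed; install it first.")
```

Run inside the target environment, this prints the commands to copy into the shell; it does not install anything itself.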
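2.2 Installing from source
My machine has the new RTX 3080, which needs CUDA 11.0, and the library does not provide prebuilt packages for CUDA 11.0, so the only option was to install from source. (The RTX 3080 is a mixed blessing.)
2.2.1 Check that PyTorch has CUDA support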
$ python -c "import torch; print(torch.cuda.is_available())"
>>> True
2.2.2 Add CUDA to PATH and CPATH
$ export PATH=/usr/local/cuda/bin:$PATH
$ echo $PATH
>>> /usr/local/cuda/bin:...
$ export CPATH=/usr/local/cuda/include:$CPATH
$ echo $CPATH
>>> /usr/local/cuda/include:...
2.2.3 Add CUDA to the dynamic library path
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
>>> /usr/local/cuda/lib64:...
$ export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
$ echo $DYLD_LIBRARY_PATH
>>> /usr/local/cuda/lib:...
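Instead of echoing each variable by hand, the exports from 2.2.2 and 2.2.3 could be checked with a few lines of Python. This is a sketch under assumptions: the function name missing_cuda_entries is invented here, CUDA is assumed to live in /usr/local/cuda, and DYLD_LIBRARY_PATH is left out because it is only used on macOS.

```python
import os


def missing_cuda_entries(env, cuda_home="/usr/local/cuda"):
    """Return (variable, directory) pairs that are not yet present in env."""
    expected = {
        "PATH": f"{cuda_home}/bin",
        "CPATH": f"{cuda_home}/include",
        "LD_LIBRARY_PATH": f"{cuda_home}/lib64",
    }
    missing = []
    for variable, directory in expected.items():
        # Search paths on Linux are colon-separated lists of directories.
        entries = env.get(variable, "").split(":")
        if directory not in entries:
            missing.append((variable, directory))
    return missing


if __name__ == "__main__":
    problems = missing_cuda_entries(os.environ)
    for variable, directory in problems:
        print(f"{variable} does not contain {directory}")
    if not problems:
        print("All CUDA paths are set.")
```

Note that export only affects the current shell session, so the check has to run in the same shell that will later run pip.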
2.2.4 Make sure PyTorch and the system use the same CUDA version
$ python -c "import torch; print(torch.version.cuda)"
>>> 11.0
$ nvcc --version
>>> 11.0
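Note: if the two differ, either install the CUDA version that matches the current PyTorch, or install a PyTorch build that matches the installed CUDA.
2.2.5 Install the following packages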
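The comparison in 2.2.4 can also be automated. The sketch below assumes that nvcc --version prints a line containing "release X.Y", as current CUDA toolkits do; the function names are made up for this example.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract e.g. '11.0' from text containing 'release 11.0'; None if absent."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda, nvcc_output):
    """True when PyTorch's CUDA version equals the version reported by nvcc."""
    return torch_cuda is not None and torch_cuda == parse_nvcc_release(nvcc_output)


if __name__ == "__main__":
    try:
        import torch
        result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
        print("PyTorch CUDA:", torch.version.cuda)
        print("nvcc CUDA:   ", parse_nvcc_release(result.stdout))
        print("match:       ", cuda_versions_match(torch.version.cuda, result.stdout))
    except (ImportError, FileNotFoundError) as exc:
        # Either PyTorch is missing or nvcc is not on PATH.
        print(f"check skipped: {exc}")
```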
pip install torch-scatter
pip install torch-sparse
pip install torch-cluster
pip install torch-spline-conv
pip install torch-geometric
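The installation was a bit slow but finished without any errors, which felt great. Then I noticed a remark from the library's author further down the page, roughly:
In rare cases the installation succeeds but errors appear at run time; solutions to the common errors are helpfully attached.
Well... my 3080 setup probably falls into this category. I quickly opened a Python session and tried importing the package: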
import torch_geometric
Sure enough, it raised an error, and not one of the common problems covered in the official documentation:
>>> import torch_geometric
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/__init__.py", line 2, in <module>
import torch_geometric.nn
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/nn/__init__.py", line 2, in <module>
from .data_parallel import DataParallel
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/nn/data_parallel.py", line 5, in <module>
from torch_geometric.data import Batch
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
from .data import Data
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/data/data.py", line 7, in <module>
from torch_sparse import coalesce, SparseTensor
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/__init__.py", line 35, in <module>
from .tensor import SparseTensor # noqa
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/tensor.py", line 11, in <module>
class SparseTensor(object):
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch/jit/_script.py", line 924, in script
_compile_and_register_class(obj, _rcb, qualified_name)
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch/jit/_script.py", line 64, in _compile_and_register_class
torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError:
Expected a default value of type Tensor (inferred) on parameter "tensor".Because "tensor" was not annotated with an explicit type it is assumed to be type 'Tensor'.:
File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/tensor.py", line 126
def type_as(self, tensor=torch.Tensor):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
value = self.storage.value()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
if value is None or tensor.dtype == value.dtype:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
return self
~~~~~~~~~~~
return self.from_storage(self.storage.type_as(tensor))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
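A traceback this long makes it hard to see which dependency is at fault. One way to narrow it down is to import the packages one by one and record the first error from each; the helper try_imports below is a hypothetical name used only for this sketch.

```python
import importlib


def try_imports(names):
    """Import each module by name; map it to None on success or an error summary."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # the failure above is a RuntimeError, not an ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    modules = [
        "torch_scatter",
        "torch_sparse",
        "torch_cluster",
        "torch_spline_conv",
        "torch_geometric",
    ]
    for name, error in try_imports(modules).items():
        print(f"{name}: {'OK' if error is None else error}")
```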
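After a lot of searching, the culprit turned out to be one of the dependencies, torch_sparse. Others have reported the same issue, but without explaining the cause. The fix is:
In the failing file tensor.py, change line 126 from def type_as(self, tensor=torch.Tensor):
to def type_as(self, tensor: torch.Tensor):
and the import then succeeds.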
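The error message itself hints at why this one-character change matters: with =, the class torch.Tensor becomes the parameter's default value, while with : it becomes a type annotation. The plain-Python sketch below shows that difference using a stand-in Tensor class, so it runs without PyTorch; it illustrates the syntax only and is not a reproduction of the TorchScript compiler's behavior.

```python
import inspect


class Tensor:
    """Stand-in for torch.Tensor so the example runs without PyTorch."""


def type_as_original(self, tensor=Tensor):
    # '=' makes the Tensor class itself the default VALUE of the parameter.
    return tensor


def type_as_patched(self, tensor: Tensor):
    # ':' makes Tensor a type ANNOTATION; the parameter has no default.
    return tensor


def describe(func):
    """Return (has_default, has_annotation) for the 'tensor' parameter."""
    param = inspect.signature(func).parameters["tensor"]
    has_default = param.default is not inspect.Parameter.empty
    has_annotation = param.annotation is not inspect.Parameter.empty
    return has_default, has_annotation


if __name__ == "__main__":
    print("original:", describe(type_as_original))
    print("patched: ", describe(type_as_patched))
```

In the original signature the default is a class rather than a tensor instance, which is consistent with the "Expected a default value of type Tensor" complaint in the traceback.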
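Summary
I never pinned down exactly what caused this problem, but it is solved in the end. It does not seem to be the RTX 3080's fault. Hopefully everything goes smoothly from here.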