Notes on Installing PyTorch Geometric from Source

Preface

Installing the pytorch_geometric library from source on Ubuntu with an RTX 3080.

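1. What is pytorch_geometric?

pytorch_geometric is a PyTorch extension library for learning with graph neural networks. It bundles common graph neural network architectures and classic methods. See its official documentation for details.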

2. Installation steps

2.1 Installing the prebuilt packages

2.1.1 Make sure you have a suitable PyTorch version

$ python -c "import torch; print(torch.__version__)"
>>> 1.6.0

2.1.2 Make sure you have a suitable CUDA version

$ python -c "import torch; print(torch.version.cuda)"
>>> 10.2

2.1.3 Install the following packages

pip install torch-scatter==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-cluster==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-spline-conv==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html
pip install torch-geometric

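Here ${CUDA} and ${TORCH} must be replaced with the versions on your own machine. The CUDA version (one of cpu, cu92, cu101, cu102) can be checked with:

cat /usr/local/cuda/version.txt

The PyTorch version is one of 1.4.0, 1.5.0, 1.6.0. For example, for PyTorch 1.5.0/1.5.1 and CUDA 10.2: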

pip install torch-scatter==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-sparse==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-cluster==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-spline-conv==latest+cu102 -f https://pytorch-geometric.com/whl/torch-1.5.0.html
pip install torch-geometric

For PyTorch 1.6.0 and CUDA 10.1:

pip install torch-scatter==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-sparse==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-cluster==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-spline-conv==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.6.0.html
pip install torch-geometric
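The placeholder substitution above can be sketched in Python. This is only an illustration: the helper names cuda_tag, torch_tag, and pip_commands are hypothetical, and rounding the patch version down to .0 is an assumption generalized from the 1.5.0/1.5.1 example.

```python
def cuda_tag(cuda_version):
    """Map torch.version.cuda ("10.2", or None for CPU builds) to a wheel tag."""
    if cuda_version is None:
        return "cpu"
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def torch_tag(torch_version):
    """Drop any local suffix and the patch number, e.g. "1.5.1+cu102" -> "1.5.0".

    Assumption: every patch release shares the x.y.0 wheel index, as the
    1.5.0/1.5.1 example suggests.
    """
    major, minor = torch_version.split("+")[0].split(".")[:2]
    return f"{major}.{minor}.0"


def pip_commands(torch_version, cuda_version):
    """Build the five install commands for the given PyTorch/CUDA pair."""
    index = f"https://pytorch-geometric.com/whl/torch-{torch_tag(torch_version)}.html"
    tag = cuda_tag(cuda_version)
    packages = ["torch-scatter", "torch-sparse", "torch-cluster", "torch-spline-conv"]
    commands = [f"pip install {name}==latest+{tag} -f {index}" for name in packages]
    commands.append("pip install torch-geometric")
    return commands


# Print the commands for PyTorch 1.6.0 with CUDA 10.1.
for command in pip_commands("1.6.0", "10.1"):
    print(command)
```

With PyTorch installed, the two arguments can be taken directly from torch.__version__ and torch.version.cuda.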

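2.2 Installing from source

My machine has the brand-new RTX 3080, which means CUDA has to be 11.0, and the prebuilt packages above do not cover CUDA 11.0 (only cpu, cu92, cu101, cu102). So the only option is to install from source. (The RTX 3080 is a mixed blessing.)

2.2.1 Check that PyTorch has CUDA support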

$ python -c "import torch; print(torch.cuda.is_available())"
>>> True

2.2.2 Add CUDA to the environment variables

$ export PATH=/usr/local/cuda/bin:$PATH
$ echo $PATH
>>> /usr/local/cuda/bin:...

$ export CPATH=/usr/local/cuda/include:$CPATH
$ echo $CPATH
>>> /usr/local/cuda/include:...

2.2.3 Add CUDA to the dynamic library path

$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
>>> /usr/local/cuda/lib64:...

$ export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH
$ echo $DYLD_LIBRARY_PATH
>>> /usr/local/cuda/lib:...
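As a quick sanity check on the exports in 2.2.2 and 2.2.3, a small Python sketch can report which variables still lack the CUDA directory. The function name missing_cuda_entries is hypothetical, and /usr/local/cuda is assumed as the install prefix, as in the commands above. DYLD_LIBRARY_PATH is left out because it is only used on macOS.

```python
import os


def missing_cuda_entries(env, cuda_home="/usr/local/cuda"):
    """Return the names of variables in `env` that lack the expected CUDA directory."""
    expected = {
        "PATH": f"{cuda_home}/bin",
        "CPATH": f"{cuda_home}/include",
        "LD_LIBRARY_PATH": f"{cuda_home}/lib64",
    }
    missing = []
    for name, directory in expected.items():
        # Entries are colon-separated on Linux.
        entries = env.get(name, "").split(":")
        if directory not in entries:
            missing.append(name)
    return missing


# An empty list means all three variables contain the CUDA directory.
print(missing_cuda_entries(os.environ))
```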

2.2.4 Make sure the CUDA version of PyTorch matches the system CUDA version

$ python -c "import torch; print(torch.version.cuda)"
>>> 11.0

$ nvcc --version

>>> 11.0

Note: if the two differ, either install the CUDA version that matches your current PyTorch, or install a PyTorch build that matches your CUDA version.
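The comparison in this step can also be scripted. The sketch below assumes that the output of nvcc --version contains a fragment of the form "release 11.0"; the helper names are hypothetical, and the sample line is illustrative (the build number on your machine will differ).

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract e.g. "11.0" from text containing "release 11.0"; None if absent."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def read_nvcc_output():
    """Run `nvcc --version` and return its text (requires nvcc on PATH)."""
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout


def cuda_versions_match(torch_cuda, nvcc_output):
    """True if PyTorch's CUDA version equals the toolkit release reported by nvcc."""
    return torch_cuda is not None and torch_cuda == parse_nvcc_release(nvcc_output)


# Illustrative sample line, not captured from a real machine.
sample = "Cuda compilation tools, release 11.0, V11.0.194"
print(cuda_versions_match("11.0", sample))
```

With PyTorch installed and nvcc on the PATH, the real check would be cuda_versions_match(torch.version.cuda, read_nvcc_output()).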

2.2.5 Install the following packages

pip install torch-scatter
pip install torch-sparse
pip install torch-cluster
pip install torch-spline-conv
pip install torch-geometric

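Installation was a bit slow, but it finished without any errors, which made me very happy! Later, though, I noticed a remark from the library's author below the instructions, roughly saying:

In rare cases, installation completes without errors but problems show up at runtime. The author also helpfully lists fixes for common errors.

Well... my 3080 probably falls into this category. I quickly opened a Python session and tried importing the package: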

import torch_geometric

Sure enough, it failed! Damn! And the error is not among the common problems listed in the official documentation:

>>> import torch_geometric
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/__init__.py", line 2, in <module>
    import torch_geometric.nn
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/nn/__init__.py", line 2, in <module>
    from .data_parallel import DataParallel
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/nn/data_parallel.py", line 5, in <module>
    from torch_geometric.data import Batch
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_geometric/data/data.py", line 7, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/__init__.py", line 35, in <module>
    from .tensor import SparseTensor  # noqa
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/tensor.py", line 11, in <module>
    class SparseTensor(object):
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch/jit/_script.py", line 924, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch/jit/_script.py", line 64, in _compile_and_register_class
    torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError: 
Expected a default value of type Tensor (inferred) on parameter "tensor".Because "tensor" was not annotated with an explicit type it is assumed to be type 'Tensor'.:
  File "/home/haowei/anaconda3/envs/cellDet/lib/python3.7/site-packages/torch_sparse/tensor.py", line 126
    def type_as(self, tensor=torch.Tensor):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        value = self.storage.value()
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        if value is None or tensor.dtype == value.dtype:
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            return self
            ~~~~~~~~~~~
        return self.from_storage(self.storage.type_as(tensor))
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
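When an import chain fails like this, one way to narrow down the culprit is to import each dependency separately. The sketch below is a generic helper, not part of PyTorch Geometric: the name try_imports is hypothetical, and the module list simply mirrors the packages installed in 2.2.5.

```python
import importlib


def try_imports(module_names):
    """Import each module in turn; map name -> None on success, else an error summary."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, RuntimeError, OSError, ...
            # Keep only the first non-empty line of the message for readability.
            first_line = (str(exc).strip().splitlines() or [""])[0]
            results[name] = f"{type(exc).__name__}: {first_line}"
    return results


PYG_MODULES = [
    "torch",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
    "torch_spline_conv",
    "torch_geometric",
]

for name, error in try_imports(PYG_MODULES).items():
    print(f"{name}: {'ok' if error is None else error}")
```

The first module that reports an error is the place to start looking.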

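After a lot of searching, I found that the problem lies in one of the dependencies, torch_sparse. Others have reported it too, but without explaining the cause. The fix is:
In the failing file tensor.py, change line 126 from def type_as(self, tensor=torch.Tensor): to def type_as(self, tensor: torch.Tensor): and the import succeeds!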
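One plausible reading of the error message is that the original signature gives the parameter a default value (the class torch.Tensor itself) instead of a type annotation, and TorchScript then rejects that default because it is not a Tensor instance. The difference between the two spellings can be seen in plain Python. In this sketch, Tensor is a hypothetical stand-in class so that the example runs without PyTorch.

```python
import inspect


class Tensor:
    """Hypothetical stand-in for torch.Tensor, so no PyTorch install is needed."""


def type_as_default(self, tensor=Tensor):
    # "=" makes the class object the DEFAULT VALUE of the parameter.
    return tensor


def type_as_annotated(self, tensor: Tensor):
    # ":" makes the class the TYPE ANNOTATION; there is no default value.
    return tensor


def describe(func):
    """Return (has_default, has_annotation) for the `tensor` parameter."""
    param = inspect.signature(func).parameters["tensor"]
    has_default = param.default is not inspect.Parameter.empty
    has_annotation = param.annotation is not inspect.Parameter.empty
    return has_default, has_annotation


print(describe(type_as_default))    # (True, False)
print(describe(type_as_annotated))  # (False, True)
```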

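Summary

I never found out what actually caused this problem, but in the end it is solved. It doesn't seem to be the 3080's fault. Hopefully everything goes smoothly from here~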
