Architecture diagram
Install the GPU runtime
https://nvidia.github.io/nvidia-container-runtime/
-----
cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
------
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo | sudo tee /etc/yum.repos.d/nvidia-container-runtime.repo
sudo yum install nvidia-container-runtime
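To make the `$distribution` substitution above easier to follow, here is a minimal sketch that builds the same repo URL from an os-release file. The function name `repo_url` and its optional file argument are hypothetical, added only for illustration; the URL pattern is the one used in the command above. On CentOS 7, /etc/os-release sets ID to centos and VERSION_ID to 7, so the path segment becomes centos7.

```shell
# Hypothetical helper (not part of the NVIDIA tooling): build the repo URL
# the same way the install command does, but from any os-release file.
repo_url() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file inside a subshell so ID and VERSION_ID do not leak
    # into the calling shell.
    distribution=$(. "$os_release_file"; echo "$ID$VERSION_ID")
    echo "https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo"
}
```

Calling `repo_url` with no argument reads /etc/os-release on the current host, which is a quick way to confirm the URL exists before piping it into /etc/yum.repos.d/.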
-----
cat /etc/docker/daemon.json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "registry-mirrors": ["http://28cf50e2.m.daocloud.io"],
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "graph": "/data/data/docker",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "300m",
    "max-file": "3"
  }
}
---
systemctl daemon-reload
systemctl restart docker
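Before relying on the new default runtime, it can help to confirm that the file on disk really selects it. The following is a rough sketch; the function name `default_runtime_is_nvidia` is hypothetical, and the check is a plain grep, so it does not validate the JSON syntax itself.

```shell
# Hypothetical helper: succeed (exit status 0) if the given daemon.json sets
# "default-runtime" to "nvidia". Defaults to /etc/docker/daemon.json.
# A plain grep is used so the check works without jq or Python.
default_runtime_is_nvidia() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "${1:-/etc/docker/daemon.json}"
}
```

Once the Docker daemon has been restarted with this configuration, the "Default Runtime" line in the output of `docker info` should also read nvidia.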
Pull the official image. We use the CentOS-based GPU (CUDA) variant.
https://hub.docker.com/r/nvidia/cuda/tags?page=1&ordering=last_updated
docker pull nvidia/cuda:11.4.1-devel-centos8
docker run -d --gpus '"device=1,2"' nvidia/cuda:11.4.1-devel-centos8 sleep 30000000000
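The nested quoting in `--gpus '"device=1,2"'` is easy to get wrong: the inner double quotes are part of the value Docker receives, because otherwise the comma is treated as a field separator. As a small sketch, the hypothetical helper `gpus_arg` below builds that value from a list of GPU indices.

```shell
# Hypothetical helper: print the value for `docker run --gpus` that selects
# the given GPU indices, including the literal inner double quotes.
gpus_arg() {
    old_ifs=$IFS
    IFS=,                          # "$*" joins arguments with the first IFS char
    printf '"device=%s"\n' "$*"
    IFS=$old_ifs
}

# Example (not executed here):
#   docker run -d --gpus "$(gpus_arg 1 2)" nvidia/cuda:11.4.1-devel-centos8 sleep 30000000000
```

Inside the container the selected devices are typically renumbered from 0, so host GPUs 1 and 2 would be expected to show up as GPU 0 and GPU 1.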
nvidia-smi
Wed Sep 15 11:25:38 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.197.02   Driver Version: 418.197.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100S-PCI...  Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   70C    P0   216W / 250W |   6295MiB / 32480MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100S-PCI...  Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   66C    P0   207W / 250W |   6295MiB / 32480MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+