This post walks through pulling a ready-made Docker image and configuring it as an environment for training deep neural networks.
Docker environment setup

    # Install nvidia-container-runtime for Docker so that containers can use the host's NVIDIA GPUs.
    # Tutorial: http://www.manongjc.com/detail/24-qcliirtikyklgea.html

    # Add the NVIDIA repository:
    curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
      sudo apt-key add -
    distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
    sudo apt-get update

    # Install the runtime:
    sudo apt-get install nvidia-container-runtime

    # Stop Docker:
    sudo systemctl stop docker

    # Register the runtime with Docker.
    # dockerd stays in the foreground, so run the remaining commands from another terminal.
    sudo dockerd --add-runtime=nvidia=/usr/bin/nvidia-container-runtime

    # Replace [your name] with your own container name
    docker run --gpus=all -dit -v /home/henry/asc:/home/asc --name [your name] tensorflow/tensorflow:2.6.1-gpu /bin/bash
    docker exec -it [container id] /bin/bash

    # Set up the environment (inside the container)
    apt update
    apt upgrade
    apt install git cmake vim
    cd /your_deepmd-kit_path
    git clone --recursive https://github.com/deepmodeling/deepmd-kit.git deepmd-kit
    cd deepmd-kit
    # Switch to the branch
    git checkout -b asc22 remotes/origin/asc-2022
    export DP_VARIANT="cuda"
    pip install .

    # Alternatively, install in developer mode
    export DP_VARIANT="cuda"
    python setup.py develop
    # or: pip install -v -e .

    # Set up the training environment
    dpkg -i nccl-local-repo-ubuntu2004-2.8.4-cuda11.2_1.0-1_amd64.deb
    # Refresh the package index so the packages from the local NCCL repo are visible
    apt update
    apt install libnccl2=2.8.4-1+cuda11.2 libnccl-dev=2.8.4-1+cuda11.2
    apt install libopenmpi-dev
    HOROVOD_WITHOUT_GLOO=1 HOROVOD_WITH_TENSORFLOW=1 HOROVOD_GPU_OPERATIONS=NCCL pip install horovod mpi4py

    # Train
    cd /your_data_path
    CUDA_VISIBLE_DEVICES=0 horovodrun -np 1 dp train --mpi-log=workers input.json
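The repository step above builds a distribution tag from /etc/os-release and splices it into the URL of the package list. As a minimal sketch, the same derivation can be wrapped in two small shell functions, so the resulting URL can be inspected before anything is written under /etc/apt. The function names `distribution_tag` and `nvidia_repo_list_url` are illustrative only and are not part of any NVIDIA tooling. The optional file argument is an assumption added here to make the logic easy to check against a sample file.

```shell
#!/bin/sh
# Sketch: reproduce the "distribution" value used in the setup commands.
# Hypothetical helper; $1 is an optional path to an os-release style file
# (absolute path recommended), defaulting to /etc/os-release.
distribution_tag() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so ID and VERSION_ID do not leak
    # into the calling shell.
    ( . "$os_release_file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Sketch: compose the package list URL that the curl command downloads.
nvidia_repo_list_url() {
    printf 'https://nvidia.github.io/nvidia-container-runtime/%s/nvidia-container-runtime.list\n' \
        "$(distribution_tag "$1")"
}
```

Calling `nvidia_repo_list_url` with no argument reads /etc/os-release on the current machine. If the printed URL contains a tag such as `ubuntu20.04`, the tag was derived as the setup commands expect.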
Proxy

pip

    # ~/.pip/pip.conf
    [global]
    index-url = https://pypi.tuna.tsinghua.edu.cn/simple
    proxy = http://58.199.160.174:3128
    [install]
    trusted-host=pypi.tuna.tsinghua.edu.cn
git

    # ~/.gitconfig
    [http]
        proxy = http://58.199.160.174:3128
apt

    # /etc/apt/apt.conf
    Acquire::http::Proxy "http://58.199.160.174:3128";
wget

    # ~/.wgetrc
    https_proxy = http://10.0.65.18:8888/
    http_proxy = http://10.0.65.18:8888/
    ftp_proxy = http://10.0.65.18:8888/

    # If you do not want to use proxy at all, set this to off.
    use_proxy = on
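The per-tool files above make the proxy permanent. For a one-off session, pip, git, wget and curl generally also honor the conventional proxy environment variables. The following is a minimal sketch of setting and clearing them in the current shell. The function names `set_proxy_env` and `unset_proxy_env` are hypothetical helpers introduced for illustration, and the proxy address is whatever you pass in.

```shell
#!/bin/sh
# Sketch: export the conventional proxy variables for the current shell.
# Hypothetical helper; $1 is the proxy URL, e.g. http://host:port
set_proxy_env() {
    proxy_url="$1"
    if [ -z "$proxy_url" ]; then
        echo "usage: set_proxy_env http://host:port" >&2
        return 1
    fi
    # Tools differ on which spelling they read, so set both cases.
    export http_proxy="$proxy_url"
    export https_proxy="$proxy_url"
    export HTTP_PROXY="$proxy_url"
    export HTTPS_PROXY="$proxy_url"
}

# Sketch: remove the variables again, e.g. after leaving the proxied network.
unset_proxy_env() {
    unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY
}
```

For example, `set_proxy_env http://58.199.160.174:3128` followed by `pip install horovod` would route that single session through the proxy. Note that `sudo` usually resets the environment, so commands run through `sudo` may still need the configuration files.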