Installing AMD/NVIDIA Drivers, CUDA 12.2, and cuDNN on Ubuntu 22.04

ORANGE_TIGER 2024-10-04 14:37:02

This guide installs drivers for an AMD WX3100 graphics card and an NVIDIA Tesla P40 compute card.

First, update the installed packages and install the build dependencies:

<code>sudo apt-get update

sudo apt-get upgrade -y

sudo apt-get install g++

sudo apt-get install gcc

sudo apt-get install make

1. Install the AMD GPU driver

First, download the graphics driver package for your Ubuntu version from the AMD driver download page.

Change to the directory that contains the downloaded package and install it:

sudo dpkg -i amdgpu-install_5.5.50503-1_all.deb

Install the driver:

sudo amdgpu-install --no-dkms</code>

sudo apt install rocm-dev

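Configure access to the AMD GPU: list the render device nodes, then add the current user to the render and video groups: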

ls -l /dev/dri/render*

sudo usermod -a -G render $LOGNAME

sudo usermod -a -G video $LOGNAME
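Group changes made with usermod only apply to new login sessions, so it can help to confirm membership before relying on it. The sketch below is one way to check; `has_group` is a hypothetical helper name introduced here, not part of ROCm.

```shell
# Hypothetical helper: succeed if the space-separated group list in $1
# contains the group named in $2.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Report membership for the two groups used above.
# id -nG prints the group names of the current session.
for grp in render video; do
    if has_group "$(id -nG)" "$grp"; then
        echo "$grp: ok"
    else
        echo "$grp: missing (log out and back in, or reboot, after usermod)"
    fi
done
```

If a group is reported as missing right after running usermod, that is expected until the next login.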

Reboot, then run rocm-smi in a terminal. Output similar to the following indicates a successful installation:

======================= ROCm System Management Interface =======================

================================= Concise Info =================================

GPU  Temp (DieEdge)  AvgPwr  SCLK    MCLK    Fan     Perf  PwrCap  VRAM%  GPU%

0    34.0c           4.211W  734Mhz  300Mhz  19.22%  auto  35.0W   11%    0%

================================================================================

============================= End of ROCm SMI Log ==============================

2. Install the compute card (NVIDIA) driver

First, run the following command in a terminal to check the NVIDIA card model and the available driver versions:

ubuntu-drivers devices

tiger@EPIC-7302-S8030GM4NE-2T:~$ ubuntu-drivers devices

== /sys/devices/pci0000:00/0000:00:03.7/0000:06:00.0 ==

modalias : pci:v000010DEd00001B38sv000010DEsd000011D9bc03sc02i00

vendor : NVIDIA Corporation

model : GP102GL [Tesla P40]

driver : nvidia-driver-535 - distro non-free recommended

driver : nvidia-driver-450-server - distro non-free

driver : nvidia-driver-470-server - distro non-free

driver : nvidia-driver-390 - distro non-free

driver : nvidia-driver-418-server - distro non-free

driver : nvidia-driver-535-server - distro non-free

driver : nvidia-driver-545 - distro non-free

driver : nvidia-driver-470 - distro non-free

driver : xserver-xorg-video-nouveau - distro free builtin
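The entry tagged "recommended" in this listing can also be picked out with a short filter, which is convenient in setup scripts. This is a minimal sketch that assumes the line layout shown above; `recommended_driver` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name from the first line tagged "recommended".
recommended_driver() {
    awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Only query the real tool when it is installed.
if command -v ubuntu-drivers >/dev/null 2>&1; then
    ubuntu-drivers devices 2>/dev/null | recommended_driver
fi
```

On the machine used in this guide, the filter would print nvidia-driver-535.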

Install the recommended driver version:

sudo apt install nvidia-driver-535

Alternatively, after checking the driver versions, install the driver through Ubuntu's built-in Software & Updates tool.

After installation, run nvidia-smi in a terminal. Output similar to the following indicates success:

tiger@EPIC-7302-S8030GM4NE-2T:~$ nvidia-smi

Wed Aug 7 23:37:11 2024

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P40                      Off | 00000000:06:00.0 Off |                  Off |
| N/A   26C    P8              10W / 250W |      4MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1271      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

3. Install CUDA

First, download from the official CUDA website the installer for a CUDA version that your driver supports.

This guide uses CUDA 12.2, which the 535 driver supports, as the example.
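The "CUDA Version" field in the nvidia-smi header reports the highest CUDA version the installed driver supports, so it can guide the choice of installer. The following sketch extracts that field; `driver_cuda_version` is a hypothetical helper name, and the pattern assumes the header layout shown earlier.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the
# value of the "CUDA Version" field from its header.
driver_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only query the real tool when the driver utilities are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi | driver_cuda_version
fi
```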

Download and run the runfile installer:

wget https://developer.download.nvidia.com/compute/cuda/12.2.2/local_installers/cuda_12.2.2_535.104.05_linux.run

sudo sh cuda_12.2.2_535.104.05_linux.run

In the installer menu, deselect the Driver entry, because the driver is already installed:

│ CUDA Installer │

│ - [ ] Driver │

│ [ ] 535.104.05 │

│ + [X] CUDA Toolkit 12.2 │

│ [X] CUDA Demo Suite 12.2 │

│ [X] CUDA Documentation 12.2 │

│ - [ ] Kernel Objects │

│ [ ] nvidia-fs │

│ Options │

│ Install │

The following summary indicates that the installation has finished:

tiger@EPIC-7302-S8030GM4NE-2T:~$ sudo sh cuda_12.2.2_535.104.05_linux.run

[sudo] password for tiger:

===========

= Summary =

===========

Driver: Not Selected

Toolkit: Installed in /usr/local/cuda-12.2/

Please make sure that

- PATH includes /usr/local/cuda-12.2/bin

- LD_LIBRARY_PATH includes /usr/local/cuda-12.2/lib64, or, add /usr/local/cuda-12.2/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.2/bin

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 535.00 is required for CUDA 12.2 functionality to work.

To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:

sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
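The installer summary states that CUDA 12.2 needs a driver of version 535.00 or newer. A script can check this by comparing the driver's major version number. This is a sketch under that assumption; `driver_at_least` is a hypothetical helper name, and the version string is the one shown in the nvidia-smi output in this guide.

```shell
# Hypothetical helper: succeed if the driver version in $1 (e.g. 535.183.01)
# has a major number of at least $2.
driver_at_least() {
    major=${1%%.*}          # strip everything from the first dot onward
    [ "$major" -ge "$2" ]
}

# Example with the versions shown in this guide.
# On a live system the version could come from:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
if driver_at_least 535.183.01 535; then
    echo "driver is new enough for CUDA 12.2"
else
    echo "driver is too old for CUDA 12.2"
fi
```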

Configure the CUDA environment variables

gedit ~/.bashrc

Append the following lines to the end of the file, replacing cuda-xx.x with your installed version (for example, cuda-12.2):

export PATH=$PATH:/usr/local/cuda-xx.x/bin

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-xx.x/lib64

export CUDA_HOME=/usr/local/cuda-xx.x

Reload the environment variables:

source ~/.bashrc
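As an alternative to editing the file by hand, the lines can be appended by a small function that does nothing when they are already present. This is only a sketch: `add_cuda_env` and the marker comment it writes are hypothetical, and the demonstration runs against a temporary file rather than the real ~/.bashrc.

```shell
# Hypothetical helper: append CUDA variables for version $2 to file $1,
# skipping the write if a previous run already added them.
add_cuda_env() {
    rc_file=$1
    cuda_dir=/usr/local/cuda-$2
    marker="# CUDA $2 environment"
    if grep -qF "$marker" "$rc_file" 2>/dev/null; then
        return 0
    fi
    {
        echo "$marker"
        echo "export CUDA_HOME=$cuda_dir"
        # Single quotes keep the variables literal so they expand at login.
        echo 'export PATH=$PATH:$CUDA_HOME/bin'
        echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64'
    } >> "$rc_file"
}

# Demonstration on a scratch file; the second call adds nothing.
demo_rc=$(mktemp)
add_cuda_env "$demo_rc" 12.2
add_cuda_env "$demo_rc" 12.2
cat "$demo_rc"
rm -f "$demo_rc"
```

To apply it for real, the call would be add_cuda_env "$HOME/.bashrc" 12.2, followed by opening a new terminal.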

Check whether the installation succeeded:

nvcc -V

Output like the following indicates success:

epyc-7302@epyc7302-S8030GM4NE-2T:~$ nvcc -V

nvcc: NVIDIA (R) Cuda compiler driver

Copyright (c) 2005-2023 NVIDIA Corporation

Built on Tue_Aug_15_22:02:13_PDT_2023

Cuda compilation tools, release 12.2, V12.2.140

Build cuda_12.2.r12.2/compiler.33191640_0
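For scripted checks, the release number can be pulled out of the nvcc banner. The sketch below assumes the "release X.Y," wording shown above; `nvcc_release` is a hypothetical helper name.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the
# toolkit release number (for example 12.2).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only call nvcc when it is on PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -V | nvcc_release
fi
```

If nothing is printed even though the toolkit is installed, PATH most likely does not yet include the CUDA bin directory.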

4. Install the cuDNN package

Download the cuDNN package that matches your CUDA version from the official NVIDIA website.

Install the cuDNN local repository package:

sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb

When it finishes, dpkg prints a command that installs the repository's GPG key. Run that command:

sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.7.29/cudnn-local-08A7D361-keyring.gpg /usr/share/keyrings/

tiger@EPIC-7302-S8030GM4NE-2T:~$ sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb

[sudo] password for tiger:

Selecting previously unselected package cudnn-local-repo-ubuntu2204-8.9.7.29.

(Reading database ... 224517 files and directories currently installed.)

Preparing to unpack cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb ...

Unpacking cudnn-local-repo-ubuntu2204-8.9.7.29 (1.0-1) ...

Setting up cudnn-local-repo-ubuntu2204-8.9.7.29 (1.0-1) ...

The public cudnn-local-repo-ubuntu2204-8.9.7.29 GPG key does not appear to be installed.

To install the key, run this command:

sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.7.29/cudnn-local-08A7D361-keyring.gpg /usr/share/keyrings/

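Next, change into the repository directory (cd /var/cudnn-local-repo-ubuntu2204-8.9.7.29/) and install the library packages. These .deb files were placed there by the previous step; install them from inside that directory with dpkg, in the order shown: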

sudo dpkg -i libcudnn8_8.9.7.29-1+cuda12.2_amd64.deb

sudo dpkg -i libcudnn8-dev_8.9.7.29-1+cuda12.2_amd64.deb

sudo dpkg -i libcudnn8-samples_8.9.7.29-1+cuda12.2_amd64.deb
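One way to confirm which cuDNN version the development package installed is to read the version macros from its header. This is a sketch: the header path /usr/include/cudnn_version.h is an assumption about where the package places the file, and `cudnn_version_from_header` is a hypothetical helper name.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH from the CUDNN_MAJOR,
# CUDNN_MINOR and CUDNN_PATCHLEVEL macros in the header file $1.
cudnn_version_from_header() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Assumed header location; adjust if the package installs it elsewhere.
header=/usr/include/cudnn_version.h
if [ -f "$header" ]; then
    cudnn_version_from_header "$header"
fi
```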

Verify that the installation works

The samples are installed in /usr/src/cudnn_samples_v8. Copy them to your home directory:

cp -r /usr/src/cudnn_samples_v8/ $HOME

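Open a terminal in the mnistCUDNN folder of the copied samples (for example, cd $HOME/cudnn_samples_v8/mnistCUDNN) and compile: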

make clean && make

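If the build fails (for example, because FreeImage.h cannot be found), install the missing dependency and compile again: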

sudo apt-get install libfreeimage-dev

Run mnistCUDNN:

./mnistCUDNN

Output like the following indicates that the installation works:

Loading image data/five_28x28.pgm

Performing forward propagation ...

Resulting weights from Softmax:

0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006

Result of classification: 1 3 5

Test passed!
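When this check is part of a provisioning script, the success line can be tested automatically instead of read by eye. A minimal sketch, assuming the sample has been built in the current directory; `mnist_passed` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the text on stdin contains the
# sample's success line.
mnist_passed() {
    grep -q 'Test passed!'
}

# Run the sample only if it has been built in the current directory.
if [ -x ./mnistCUDNN ]; then
    if ./mnistCUDNN 2>&1 | mnist_passed; then
        echo "cuDNN sample passed"
    else
        echo "cuDNN sample did not report success"
    fi
fi
```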


