Fixing compatibility problems between paddlepaddle-gpu and CUDA

赫连达 2024-10-01 12:31:02

When running models with paddlepaddle-gpu and paddlenlp, you may run into the following two errors:

OSError: (External) CUDA error(719), unspecified launch failure.[Hint: 'cudaErrorLaunchFailure'. 

An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases can be found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.] (at ../paddle/fluid/platform/device/gpu/gpu_info.cc:123)

OSError: (External) CUBLAS error(1).[Hint: 'CUBLAS_STATUS_NOT_INITIALIZED'.  The cuBLAS library was not initialized. This is usually caused by the lack of a prior cublasCreate() call, an error in the CUDA Runtime API called by the cuBLAS routine, or an error in the hardware setup.  To correct: call cublasCreate() prior to the function call; and check that the hardware, an appropriate version of the driver, and the cuBLAS library are correctly installed.  ] (at ../paddle/phi/backends/gpu/gpu_resources.cc:235)

  [operator < linear > error]

After several rounds of trial and error, here is the solution that worked for me:

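1. In my experience, the following configuration resolves both errors. (This combination is simply what worked for me after experimenting; other version combinations may work as well. Please install in the order listed.)

(1) python=3.9 (this matters: I had problems with both 3.11 and 3.8)

(2) cudatoolkit=11.7 (this matters too: other cudatoolkit versions all caused problems for me)

(3) paddlenlp=2.8.1 (I installed this after completing the steps above)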
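To check whether an existing environment matches this combination, a small script along the following lines may help. It is only a sketch: the names `EXPECTED`, `installed_version`, and `compare` are made up for illustration, and it assumes the packages register their metadata under the distribution names `paddlepaddle-gpu` and `paddlenlp`. The version of paddlepaddle-gpu is taken from the install command used in this post. cudatoolkit is a conda package rather than a Python distribution, so it is not covered here; `conda list cudatoolkit` shows it instead.

```python
import sys
from importlib import metadata

# Combination reported to work in this post (personal experience,
# not an official compatibility matrix).
EXPECTED = {"paddlepaddle-gpu": "2.6.1", "paddlenlp": "2.8.1"}
EXPECTED_PYTHON = (3, 9)


def installed_version(package):
    """Return the installed version string of a package, or None if missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare(expected, found):
    """Return a list of human-readable mismatches (empty list means all match)."""
    problems = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            problems.append(f"{name}: not installed (expected {want})")
        # Accept an exact match, or the expected version followed by a
        # trailing build suffix (anything after a further dot).
        elif have != want and not have.startswith(want + "."):
            problems.append(f"{name}: found {have}, expected {want}")
    return problems


if __name__ == "__main__":
    if sys.version_info[:2] != EXPECTED_PYTHON:
        print(f"python: found {sys.version_info[0]}.{sys.version_info[1]}, expected 3.9")
    found = {name: installed_version(name) for name in EXPECTED}
    for line in compare(EXPECTED, found):
        print(line)
```

An empty output means the Python packages match the combination above; each printed line names one mismatch.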

2. Then install directly using the official installation method; the remaining dependencies can be installed afterwards.

conda install paddlepaddle-gpu==2.6.1 cudatoolkit=11.7 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

3. I did not install cuDNN separately; reportedly it is included automatically when cudatoolkit is installed.
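One way to confirm that cuDNN really came along is to ask Paddle which CUDA and cuDNN versions it was built against, and then let it run its own self-check. The sketch below assumes Paddle is importable in the active environment; the helper name `paddle_build_info` is made up for illustration, while `paddle.version.cuda()`, `paddle.version.cudnn()`, `paddle.device.is_compiled_with_cuda()`, and `paddle.utils.run_check()` are Paddle's own functions.

```python
def paddle_build_info():
    """Return Paddle's build-time CUDA/cuDNN info, or None if Paddle cannot be imported."""
    try:
        import paddle
    except (ImportError, OSError):
        # ImportError: Paddle is not installed.
        # OSError: a shared library Paddle depends on could not be loaded.
        return None
    return {
        "paddle": paddle.__version__,
        "compiled_with_cuda": paddle.device.is_compiled_with_cuda(),
        "cuda": paddle.version.cuda(),
        "cudnn": paddle.version.cudnn(),
    }


if __name__ == "__main__":
    info = paddle_build_info()
    if info is None:
        print("paddle could not be imported in this environment")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
        import paddle

        # Runs a small computation to verify that the installation works.
        paddle.utils.run_check()
```

If the errors from the beginning of this post are still present, they would most likely surface during `paddle.utils.run_check()` rather than in the version lookup.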


