Installing NVIDIA drivers on Linux can be tricky, especially when the driver, CUDA, and cuDNN versions all have to be compatible with each other. Here’s my firsthand account of upgrading my Ubuntu system to a newer NVIDIA driver, CUDA Toolkit, and cuDNN while avoiding the common pitfalls!
1. Why Match CUDA and Driver Versions?
Before diving in, I knew it was crucial to align the NVIDIA driver version with the CUDA Toolkit. For my setup, I chose CUDA 12.6.0 paired with driver 560.28.03, the version embedded in the CUDA runfile name (cuda_12.6.0_560.28.03_linux.run). That is the driver NVIDIA bundles with this toolkit release, so the two are known to work together. I also downloaded cuDNN v8.9.2 (the CUDA 12.x build) from NVIDIA’s archives.
Download Links:
- CUDA Installer: CUDA 12.6 Archive
  - Purpose: This is the CUDA Toolkit installer. CUDA (Compute Unified Device Architecture) is a parallel computing platform and API model created by NVIDIA.
  - Function: Installing CUDA provides the compiler, libraries, and runtime needed to develop and run GPU-accelerated applications. It includes nvcc (the CUDA compiler), the CUDA libraries, and the runtime environment.
- NVIDIA Driver: Version taken from the CUDA filename (NVIDIA-Linux-x86_64-560.28.03.run).
  - Purpose: This is the NVIDIA GPU driver installer for Linux. It contains the proprietary driver your system needs to talk to an NVIDIA GPU.
  - Function: Installing it lets your system use the GPU for graphics rendering and for computational tasks such as CUDA applications.
- cuDNN: cuDNN Archive
  - Purpose: This is the cuDNN (CUDA Deep Neural Network) library, which is optimized for deep learning operations.
  - Function: It provides highly optimized GPU implementations of neural network operations such as convolution, pooling, normalization, and activation functions. cuDNN is widely used by deep learning frameworks like TensorFlow and PyTorch.
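Since the driver version comes straight out of the CUDA runfile name, it can be pulled out with plain shell parameter expansion. This is only a sketch: it assumes the cuda_<toolkit>_<driver>_linux.run naming pattern seen in this post, and the variable names are my own.

```shell
#!/usr/bin/env bash
# Sketch: split a CUDA runfile name into toolkit and driver versions.
# Assumes the pattern cuda_<toolkit>_<driver>_linux.run used in this post.
runfile="cuda_12.6.0_560.28.03_linux.run"

base="${runfile%_linux.run}"     # drop the suffix  -> cuda_12.6.0_560.28.03
base="${base#cuda_}"             # drop the prefix  -> 12.6.0_560.28.03
toolkit_version="${base%%_*}"    # text before the first underscore
driver_version="${base#*_}"      # text after the first underscore

echo "toolkit=${toolkit_version} driver=${driver_version}"
```

The driver installer you download separately should then be NVIDIA-Linux-x86_64-${driver_version}.run.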
2. Preparing the System: Stopping the GUI
To avoid conflicts with the running X server, I followed these steps:
- Close VNC/GUI Sessions: Ensure no graphical sessions are active.
- Stop the Display Manager:
sudo systemctl stop gdm.service # Replace 'gdm' with your DM (e.g., lightdm).
- Switch to Text Mode (Runlevel 3):
This switches the system to a text-based interface, killing the X server and preventing driver conflicts.
sudo init 3
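On systemd-based releases, the same two steps can probably be expressed without knowing which display manager is installed. The sketch below assumes that display-manager.service is aliased to the active display manager (as it normally is on Ubuntu) and that multi-user.target plays the role of runlevel 3. The run and stop_gui helpers are hypothetical names of mine, and the script only prints the commands unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
# Sketch: drop to a text-only session on a systemd-based system.
# Defaults to a dry run that only prints the commands; set APPLY=1 to execute.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        sudo "$@"
    else
        echo "would run: sudo $*"
    fi
}

stop_gui() {
    # Assumption: display-manager.service is an alias for the enabled DM (gdm, lightdm, ...)
    run systemctl stop display-manager.service
    # multi-user.target is the systemd counterpart of runlevel 3
    run systemctl isolate multi-user.target
}

stop_gui
```

Run it from a TTY or SSH session rather than a graphical terminal, since the terminal goes away with the GUI.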
3. Installing the NVIDIA Driver
Step 1: Install the NVIDIA Driver
Navigate to your downloads directory, make the installer executable, and run it:
cd ~/Downloads
chmod +x NVIDIA-Linux-x86_64-560.28.03.run
sudo ./NVIDIA-Linux-x86_64-560.28.03.run
The installer automatically detected my existing NVIDIA driver (530) and upgraded it to 560.28.03. Follow the on-screen prompts to complete installation.
4. Installing CUDA Toolkit 12.6
Step 1: Prepare the CUDA Installer
chmod +x cuda_12.6.0_560.28.03_linux.run
Step 2: Run the Installer
sudo ./cuda_12.6.0_560.28.03_linux.run
Here’s where I hit a snag! The installer failed when I selected the NVIDIA Kernel Module (nvidia-fs) option, which is only needed for GPUDirect Storage. To resolve this, I:
- Reran the installer with that option unchecked.
- Selected only the essential components (Toolkit, Samples, etc.).
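If you would rather skip the interactive menu, the runfile also accepts command-line options. The sketch below only builds and prints a toolkit-only command using the --silent and --toolkit options. I did not run the installer this way myself, so treat it as a starting point and check the option list with the runfile's --help first.

```shell
#!/usr/bin/env bash
# Sketch: build a toolkit-only, non-interactive install command for the runfile.
# --silent skips the menu, --toolkit installs only the CUDA Toolkit component.
runfile="cuda_12.6.0_560.28.03_linux.run"
install_cmd="sudo sh ${runfile} --silent --toolkit"

# Print the command for review instead of running it.
echo "$install_cmd"
```

Because the driver was already installed in the previous section, leaving the driver component out here avoids installing it a second time.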
Troubleshooting Tip: If the installer fails while building kernel modules, make sure the headers for your running kernel are installed:
sudo apt-get update && sudo apt-get install linux-headers-$(uname -r)
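A quick way to see whether headers for the running kernel are already there is to look for the build directory under /lib/modules. This is a sketch that assumes the usual layout where /lib/modules/<release>/build points at the headers; the variable names are mine.

```shell
#!/usr/bin/env bash
# Sketch: check whether headers for the running kernel are present.
# Assumes /lib/modules/<release>/build exists once the headers package is installed.
kernel_release="$(uname -r)"
headers_dir="/lib/modules/${kernel_release}/build"

if [ -d "$headers_dir" ]; then
    headers_status="present"
else
    headers_status="missing"
fi
echo "kernel ${kernel_release}: headers ${headers_status}"
```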
5. Installing cuDNN
Step 1: Extract and Copy Files
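tar -xJvf cudnn-linux-x86_64-8.9.2.26_cuda12-archive.tar.xz -C /tmp # -J handles .xz; -z is for gzip and fails on this file
# The archive unpacks into a directory named after itself, with include/ and lib/ inside
sudo cp /tmp/cudnn-linux-x86_64-8.9.2.26_cuda12-archive/include/cudnn*.h /usr/local/cuda-12.6/include/
sudo cp -P /tmp/cudnn-linux-x86_64-8.9.2.26_cuda12-archive/lib/libcudnn* /usr/local/cuda-12.6/lib64/ # -P keeps the .so symlinks intact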
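To confirm the copy worked, you can read the version macros out of cudnn_version.h, which defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL. The helper below is a sketch of mine (cudnn_version_from_header is a hypothetical name), and the header path assumes the /usr/local/cuda-12.6 prefix used in this post.

```shell
#!/usr/bin/env bash
# Sketch: read the cuDNN version from the macros in cudnn_version.h.
cudnn_version_from_header() {
    header="$1"
    # Each macro sits on its own "#define NAME value" line.
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "$header")
    echo "${major}.${minor}.${patch}"
}

# Assumed install location, matching the copy commands in this post.
cudnn_header="/usr/local/cuda-12.6/include/cudnn_version.h"
if [ -f "$cudnn_header" ]; then
    cudnn_version_from_header "$cudnn_header"
else
    echo "cudnn_version.h not found at ${cudnn_header}"
fi
```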
Step 2: Set Environment Variables
Add these lines to your ~/.bashrc or ~/.zshrc:
export PATH=/usr/local/cuda-12.6/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH
Reload the shell:
source ~/.bashrc # or ~/.zshrc
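If you rerun a setup like this more than once, it is easy to end up with duplicate export lines in the rc file. One way around that is a small helper that appends a line only when it is missing. This is a sketch: append_once is a hypothetical helper of mine, and the demo writes to a scratch file unless you set RC_FILE yourself.

```shell
#!/usr/bin/env bash
# Sketch: append a line to a shell rc file only if it is not already there.
# append_once is a hypothetical helper, not part of any NVIDIA tooling.
append_once() {
    line="$1"
    file="$2"
    # -x matches whole lines, -F treats the text literally (no regex)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo on a scratch file; set RC_FILE="$HOME/.bashrc" to apply it for real.
rc_file="${RC_FILE:-$(mktemp)}"
append_once 'export PATH=/usr/local/cuda-12.6/bin:$PATH' "$rc_file"
append_once 'export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH' "$rc_file"
# Asking for the same line again changes nothing.
append_once 'export PATH=/usr/local/cuda-12.6/bin:$PATH' "$rc_file"
cat "$rc_file"
```

The single quotes matter: they keep $PATH and $LD_LIBRARY_PATH from being expanded when the line is written, so the rc file expands them at shell startup instead.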
6. Final Steps: Reboot and Verify
Step 1: Reboot Safely
Before rebooting, ensure no virtual machines or critical processes are running:
sudo systemctl stop libvirtd # If using KVM
sudo reboot
Step 2: Test the Setup
After rebooting, verify driver and CUDA versions:
nvidia-smi # Check driver version and GPU status
nvcc --version # Confirm CUDA Toolkit version
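To compare what is installed against the versions targeted here, the two commands can be wrapped in a small check. This is a sketch under a few assumptions: the expected values are the ones from this post, parse_nvcc_release is a hypothetical helper of mine, and it relies on nvcc printing a "release X.Y," line in its version output.

```shell
#!/usr/bin/env bash
# Sketch: compare installed versions against the ones targeted in this post.
expected_driver="560.28.03"
expected_cuda="12.6"

# Pull "X.Y" out of the "release X.Y, VX.Y.Z" line that nvcc prints.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Query only the driver version; with several GPUs, take the first line.
    actual_driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    echo "driver: expected ${expected_driver}, found ${actual_driver}"
else
    echo "nvidia-smi not on PATH"
fi

if command -v nvcc >/dev/null 2>&1; then
    actual_cuda="$(nvcc --version | parse_nvcc_release)"
    echo "CUDA: expected ${expected_cuda}, found ${actual_cuda}"
else
    echo "nvcc not on PATH (check the PATH export from section 5)"
fi
```

Keep in mind that the "CUDA Version" shown in the nvidia-smi banner is the highest CUDA version the driver supports, not the installed toolkit, which is why the toolkit check goes through nvcc.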
7. What I Learned
- Dependency Management: Install kernel headers and dependencies first to avoid surprises.
- Runlevel 3: Switching to text mode is critical for smooth driver installation.
- cuDNN Compatibility: Always match the cuDNN build to your CUDA major version.
- Avoid Fancy Options: Stick to default installer options unless troubleshooting!
Final Thoughts
While NVIDIA driver installations can be intimidating, breaking down the steps and ensuring version compatibility made this process manageable. If you encounter errors, check NVIDIA’s Installation Guide or the logs in /var/log/nvidia-installer.log. Happy computing!
Your Turn: Did you face similar challenges? Share your experiences in the comments!
Hope this guide helps you navigate the world of NVIDIA drivers on Linux! 🚀