Pre-requisite: Enable PCI devices
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a NIC separate from the management NIC
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Validate that the PCI devices aren’t enabled
- Click the link to enable PCI devices
- Enable PCI devices in the linked addon page
- Wait for the addon status to change to “DeploySuccessful”
- Navigate to the PCI devices page
- Validate that the PCI devices page is populated/populating with PCI devices
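The entries on the PCI Devices page should line up with the PCI hardware each node reports. As a cross-check from a node shell, the VendorId/DeviceId pairs shown in the UI can be compared against `lspci -nn` output. A minimal sketch (the sample output below is hardcoded example hardware, an Intel I350 NIC and an NVIDIA GPU, so the sketch runs anywhere; on a real node, pipe the live `lspci -nn` output instead):

```shell
# Simulated `lspci -nn` output; on a real Harvester node run `lspci -nn`.
# The device names here are illustrative examples, not required hardware.
sample='01:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521]
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [10de:2204]'

# Pull out the [VendorId:DeviceId] pairs, which should match the
# VendorId/DeviceId values shown on the PCI Devices page
echo "$sample" | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]'
```

The four-hex:four-hex pattern deliberately skips the single class codes like `[0200]`, so only vendor/device pairs are printed.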
Case 1 (PCI NIC passthrough)
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a NIC separate from the management NIC
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Check the box representing the PCI NIC device (identify it by the Description or the VendorId/DeviceId combination)
- Click Enable Passthrough
- When the NIC device is in an Enabled state, create a VM
- After creating the VM, edit the Config
- In the “PCI Devices” section, click the “Available PCI Devices” dropdown
- Select the PCI NIC device that has been enabled for passthrough
- Click Save
- Start the VM
- Once the VM has booted, run
lspci
at the command line (make sure the VM has the pciutils
package installed) and verify that the PCI NIC device shows up
- (Optional) Install the driver for your PCI NIC device (if it hasn’t been autoloaded)
Case 1 dependencies:
- PCI NIC separate from management network
- Enable PCI devices
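The in-guest verification step above can be sketched as follows. The guest `lspci` output is simulated here (an example Intel I350 line, not required hardware) so the sketch runs anywhere; inside the real guest, drop the here-string and run `lspci` directly, installing `pciutils` first if it is missing:

```shell
# Inside the real guest you would run:
#   sudo apt-get install -y pciutils    # (zypper on openSUSE)
#   lspci | grep -i 'ethernet controller'
# Simulated guest lspci output (example device) so this sketch runs anywhere:
guest_lspci='00:02.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection'

# Count matching lines; a non-zero count means the passed-through NIC is visible
echo "$guest_lspci" | grep -ci 'ethernet controller'
```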
Case 2 (GPU passthrough)
Case 2-1 Add GPU
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a GPU (in addition to the management NIC)
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Check the box representing the GPU device (identify it by the Description or the VendorId/DeviceId combination)
- Click Enable Passthrough
- When the GPU device is in an Enabled state, create a VM
- After creating the VM, edit the Config
- In the “PCI Devices” section, click the “Available PCI Devices” dropdown
- Select the GPU device that has been enabled for passthrough
- Click Save
- Start the VM
- Once the VM has booted, run
lspci
at the command line (make sure the VM has the pciutils
package installed) and verify that the GPU device shows up
- Install the driver for your GPU device
- If the device is from NVIDIA (these commands are for Ubuntu; openSUSE has its own installation instructions):
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda nvidia-cuda-toolkit build-essential
- Check out https://github.com/nvidia/cuda-samples
git clone https://github.com/nvidia/cuda-samples
cd cuda-samples/Samples/3_CUDA_Features/cudaTensorCoreGemm
make
- If you need to install the drivers for the NVIDIA card, you can use the following
sudo apt-get -y install ubuntu-drivers-common && sudo ubuntu-drivers autoinstall
- If that doesn’t work, you can list the drivers that are available with
ubuntu-drivers devices
(the command is provided by the ubuntu-drivers-common package)
- Run
./cudaTensorCoreGemm
and verify that the program completed correctly
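“Completed correctly” can be checked mechanically: the CUDA samples exit non-zero on failure, so the exit status serves as the pass/fail signal. An offline sketch, where a `true` stand-in replaces `./cudaTensorCoreGemm` (which can only run in a guest with the GPU and CUDA driver installed):

```shell
# run_sample stands in for the real sample binary; on the guest, replace
# the body with: ./cudaTensorCoreGemm
run_sample() { true; }

# Report pass/fail based on the exit status of the sample run
if run_sample; then
  echo "cudaTensorCoreGemm: PASS"
else
  echo "cudaTensorCoreGemm: FAIL"
fi
```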
- If the device is from AMD/ATI, install and use the
aticonfig
command to inspect the device
Case 2-2 Negative add GPU
- Pre-requisite: the GPU should already be assigned to another VM
- Edit the VM that doesn’t have the GPU assigned
- Open up the “PCI Devices” section
- Verify that you can’t add the already-assigned GPU to this VM; it should be greyed out
Case 2-3 Remove GPU
- Edit the VM where the GPU is assigned
- In the “PCI Devices” section, clear the selected devices
- Click Save
- If the VM is running, you will be prompted to reboot it
- Validate that the GPU has been removed by running
lspci
and, if the GPU supports CUDA, by trying to run
./cudaTensorCoreGemm
(it should no longer work)
- Open up another VM and verify that the GPU is listed as available in the
“PCI Devices”
section
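The removal check mirrors the earlier in-guest verification, but now expects zero matches. A minimal sketch with simulated post-removal `lspci` output (the Virtio GPU line is an example of what an ordinary emulated display device might look like; on the real guest, run `lspci` directly):

```shell
# Simulated guest lspci output after the GPU was detached (no NVIDIA entry);
# on the real guest run:  lspci | grep -ci nvidia
guest_lspci='00:02.0 VGA compatible controller: Red Hat, Inc. Virtio GPU'

# A count of 0 confirms the passed-through NVIDIA GPU is no longer visible
# (|| true keeps the pipeline's exit status clean when grep finds nothing)
echo "$guest_lspci" | grep -ci 'nvidia' || true
```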