Pre-requisite: Enable PCI devices
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a NIC separate from the management NIC
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Validate that the PCI devices aren’t enabled
- Click the link to enable PCI devices
- Enable PCI devices in the linked addon page
- Wait for the addon status to change to “DeploySuccessful”
- Navigate to the PCI devices page
- Validate that the PCI devices page is populated/populating with PCI devices
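The entries on the PCI Devices page should line up with the PCI hardware each node reports. As a cross-check from a node shell, the VendorId/DeviceId pairs shown in the UI can be compared against `lspci -nn` output. A minimal sketch (the sample output below is hardcoded example hardware, an Intel I350 NIC and an NVIDIA GPU, so the sketch runs anywhere; on a real node, pipe the live `lspci -nn` output instead):

```shell
# Simulated `lspci -nn` output; on a real Harvester node run `lspci -nn`.
# The device names here are illustrative examples, not required hardware.
sample='01:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit Network Connection [8086:1521]
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102 [10de:2204]'

# Pull out the [VendorId:DeviceId] pairs, which should match the
# VendorId/DeviceId values shown on the PCI Devices page
echo "$sample" | grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]'
```

The four-hex:four-hex pattern deliberately skips the single class codes like `[0200]`, so only vendor/device pairs are printed.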
Case 1 (PCI NIC passthrough)
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a NIC separate from the management NIC
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Check the box representing the PCI NIC device (identify it by the Description or the VendorId/DeviceId combination)
- Click Enable Passthrough
- When the NIC device is in an Enabled state, create a VM
- After creating the VM, edit the Config
- In the “PCI Devices” section, click the “Available PCI Devices” dropdown
- Select the PCI NIC device that has been enabled for passthrough
- Click Save
- Start the VM
- Once the VM has booted, run
lspci
at the command line (make sure the VM has the pciutils
package installed) and verify that the PCI NIC device shows up
- (Optional) Install the driver for your PCI NIC device (if it hasn’t been autoloaded)
Case 1 dependencies:
- PCI NIC separate from management network
- Enable PCI devices
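The in-guest verification step above can be sketched as follows. The guest `lspci` output is simulated here (an example Intel I350 line, not required hardware) so the sketch runs anywhere; inside the real guest, drop the here-string and run `lspci` directly, installing `pciutils` first if it is missing:

```shell
# Inside the real guest you would run:
#   sudo apt-get install -y pciutils    # (zypper on openSUSE)
#   lspci | grep -i 'ethernet controller'
# Simulated guest lspci output (example device) so this sketch runs anywhere:
guest_lspci='00:02.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection'

# Count matching lines; a non-zero count means the passed-through NIC is visible
echo "$guest_lspci" | grep -ci 'ethernet controller'
```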
Case 2 (GPU passthrough)
Case 2-1 Add GPU
- Create a Harvester cluster in bare-metal mode. Ensure one of the nodes has a GPU (in addition to the management NIC)
- Go to the management interface of the new cluster
- Go to Advanced -> PCI Devices
- Check the box representing the GPU device (identify it by the Description or the VendorId/DeviceId combination)
- Click Enable Passthrough
- When the GPU device is in an Enabled state, create a VM
- After creating the VM, edit the Config
- In the “PCI Devices” section, click the “Available PCI Devices” dropdown
- Select the GPU device that has been enabled for passthrough
- Click Save
- Start the VM
- Once the VM has booted, run
lspci
at the command line (make sure the VM has the pciutils
package installed) and verify that the GPU device shows up
- Install the driver for your GPU device
- If the device is from NVIDIA (these commands are for Ubuntu; openSUSE has its own installation instructions):
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda nvidia-cuda-toolkit build-essential
- Check out https://github.com/nvidia/cuda-samples
git clone https://github.com/nvidia/cuda-samples
cd cuda-samples/Samples/3_CUDA_Features/cudaTensorCoreGemm
make
- If you need to install the drivers for the NVIDIA card, you can use the following
sudo apt-get -y install ubuntu-drivers-common && sudo ubuntu-drivers autoinstall
- If that doesn’t work, you can list the drivers that are available with
ubuntu-drivers devices
(the command is provided by the ubuntu-drivers-common package)
- Run
./cudaTensorCoreGemm
and verify that the program completed correctly
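“Completed correctly” can be checked mechanically: the CUDA samples exit non-zero on failure, so the exit status serves as the pass/fail signal. An offline sketch, where a `true` stand-in replaces `./cudaTensorCoreGemm` (which can only run in a guest with the GPU and CUDA driver installed):

```shell
# run_sample stands in for the real sample binary; on the guest, replace
# the body with: ./cudaTensorCoreGemm
run_sample() { true; }

# Report pass/fail based on the exit status of the sample run
if run_sample; then
  echo "cudaTensorCoreGemm: PASS"
else
  echo "cudaTensorCoreGemm: FAIL"
fi
```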
- If the device is from AMD/ATI, install and use the
aticonfig
command to inspect the device
Case 2-2 Negative add GPU
- Pre-requisite: the GPU should already be assigned to another VM
- Edit the VM that doesn’t have the GPU assigned
- Open up the “PCI Devices” section
- Verify that you can’t add the already-assigned GPU to this VM; it should be greyed out
Case 2-3 Remove GPU
- Edit the VM where the GPU is assigned
- In the “PCI Devices” section, clear the selected devices
- Click Save
- If the VM is running, you will be prompted to reboot it
- Validate that the GPU has been removed by running
lspci
and, if the GPU supports CUDA, by trying to run
./cudaTensorCoreGemm
(it should no longer work)
- Open up another VM and verify that the GPU is listed as available in the
“PCI Devices”
section
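The removal check mirrors the earlier in-guest verification, but now expects zero matches. A minimal sketch with simulated post-removal `lspci` output (the Virtio GPU line is an example of what an ordinary emulated display device might look like; on the real guest, run `lspci` directly):

```shell
# Simulated guest lspci output after the GPU was detached (no NVIDIA entry);
# on the real guest run:  lspci | grep -ci nvidia
guest_lspci='00:02.0 VGA compatible controller: Red Hat, Inc. Virtio GPU'

# A count of 0 confirms the passed-through NVIDIA GPU is no longer visible
# (|| true keeps the pipeline's exit status clean when grep finds nothing)
echo "$guest_lspci" | grep -ci 'nvidia' || true
```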