Background you need to know
1. vGPU
vGPU technology splits a physical GPU into multiple virtual GPUs so that it can back several virtual machines. Each virtual GPU can be assigned to a different VM, giving multiple VMs their own "dedicated" graphics card.
2. vGPU licensing
1. On the host:
the host driver itself needs no license. However, only enterprise-grade GPUs are recognized by the vGPU host driver; consumer cards cannot legally be used as vGPU hardware.
2. In the guest:
using the vGPU guest driver requires an NVIDIA license. vGPU instances are divided into four series, A, B, C and Q, according to their capabilities, each with its own license fee. An unlicensed driver gradually throttles the vGPU's performance until it becomes unusable.
3. Of the vGPU series, the one that covers the widest range of workloads is the Q series, licensed as vWS.
Series | License type | Typical use case | Features | Limitations | Driver notes
---|---|---|---|---|---
Q | vWS | Virtual workstations, best performance, professional users | Full feature set (DirectX, CUDA), no framebuffer cap | |
C | vCS or vWS | AI / machine-learning workloads | GPU compute APIs (CUDA, OpenCL), minimum framebuffer 4 GB (allocated in 4 GB increments) | No Windows driver (Windows not supported) |
B | vPC | Graphics workloads, virtual desktops | DirectX, framebuffer configurable up to 2 GB | No GPU compute APIs (no CUDA / OpenCL) | Graphics workloads need the GRID driver plus matching GPU driver, CUDA and cuDNN versions
A | vApps | Virtual applications | | |
3. vGPU-unlock
At the hardware level, data-center and consumer GPUs of the same generation share the same GPU architecture; the supported features are only restricted in the driver. By spoofing the device ID and patching the host driver, a consumer card can be made to masquerade as the professional card built on the same core, so ordinary consumer GPUs can also support virtualization. On the VM side, hooks placed before and after vGPU startup bypass specific checks and report fake results so that the guest's vGPU can start normally. See the referenced article for the consumer card models currently supported.
4. Which GPUs support vGPU
See NVIDIA's official product page.
A. GPU passthrough
Step 1: enable IOMMU / VT-d, Above 4G Decoding and SR-IOV in the motherboard BIOS
- IOMMU: an address-remapping technology; VT-d is Intel's name for the same feature
- Above 4G Decoding: enables 64-bit addressing of PCIe device memory, typically needed when the CPU must be able to access the card's entire video memory; recommended when using vGPU
- SR-IOV: allows one PCIe device to be shared by multiple virtual machines; commonly used to share NICs and similar devices.
Step 2: confirm that the CPU supports virtualization, and disable SELinux
# Check for CPU virtualization support
egrep -o '(vmx|svm)' /proc/cpuinfo
# Disable SELinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
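# Optional: also switch SELinux to permissive mode right away (the config change above
# only takes effect after the next reboot); a small convenience, not strictly required
setenforce 0
getenforce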
Step 3: enable the IOMMU on the kernel command line
# intel_iommu=on iommu=pt # Note: iommu=pt does not hurt KVM/DPDK/SPDK performance (all three are essentially user-space drivers); it only affects kernel drivers, letting devices driven by the kernel perform better.
# On Intel, add: rd.driver.pre=vfio-pci intel_iommu=on iommu=pt video=efifb:off,vesafb:off
# On AMD, add: rd.driver.pre=vfio-pci amd_iommu=on iommu=pt video=efifb:off,vesafb:off
[root@sv-gpu-node-001 ~]# cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/cs-swap rd.lvm.lv=cs00/root rd.lvm.lv=cs/swap rhgb quiet rd.driver.pre=vfio-pci intel_iommu=on iommu=pt video=efifb:off,vesafb:off"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
Step 4: regenerate the GRUB configuration
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# Parameter notes:
# vfio-pci: the driver needed for GPU passthrough virtualization
# intel_iommu=on iommu=pt: enable the IOMMU and passthrough grouping
# efifb:off: disable the EFI framebuffer console on the GPU
# vesafb:off: disable the legacy (VESA) framebuffer console on the GPU
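Since this host has GRUB_ENABLE_BLSCFG=true, the same kernel arguments can alternatively be pushed into every boot entry with grubby. This is only a sketch of that alternative; the grub2-mkconfig route above is what was actually used here.
# Append the passthrough-related arguments to all installed kernels
grubby --update-kernel=ALL --args="rd.driver.pre=vfio-pci intel_iommu=on iommu=pt video=efifb:off,vesafb:off"
# Confirm the arguments on the default kernel
grubby --info=DEFAULT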
Step 5: load the kernel modules required for GPU passthrough
cat > /etc/modules-load.d/vfio.conf << EOF
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
Step 6: blacklist the stock NVIDIA driver and the open-source nouveau driver to avoid passthrough conflicts
cat > /etc/modprobe.d/blacklist.conf << EOF
blacklist nouveau
blacklist nvidia
options nouveau modeset=0
EOF
Step 7: rebuild the initramfs
# Back up the current initramfs first
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
# Rebuild it
dracut /boot/initramfs-$(uname -r).img $(uname -r)
# Reboot
reboot
Step 8: verify
# Verify that the IOMMU is enabled
[root@sv-gpu-node-001 ~]# dmesg | grep -e DMAR -e IOMMU
[ 0.000000] ACPI: DMAR 0x000000007DF6D650 000160 (v01 A M I OEMDMAR 00000001 INTL 00000001)
[ 0.000000] ACPI: Reserving DMAR table memory at [mem 0x7df6d650-0x7df6d7af]
[ 0.000000] DMAR: IOMMU enabled
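# Optionally, inspect how devices were split into IOMMU groups (a minimal helper, assuming
# lspci from pciutils is installed; each group is the smallest unit that can be passed through)
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU Group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done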
# This is what it looks like when the open-source nouveau driver has NOT been blacklisted
[root@localhost ~]# lsmod | grep nouveau
nouveau 2355200 4
video 53248 1 nouveau
mxm_wmi 16384 1 nouveau
wmi 32768 2 mxm_wmi,nouveau
drm_display_helper 151552 1 nouveau
i2c_algo_bit 16384 2 ast,nouveau
drm_kms_helper 167936 5 drm_vram_helper,ast,drm_display_helper,nouveau
drm_ttm_helper 16384 3 drm_vram_helper,ast,nouveau
ttm 81920 3 drm_vram_helper,drm_ttm_helper,nouveau
drm 577536 13 drm_kms_helper,drm_vram_helper,ast,drm_display_helper,drm_ttm_helper,ttm,nouveau
# This is what it looks like when nouveau has been successfully blacklisted (the expected result)
[root@localhost ~]# lsmod | grep nouveau
[root@localhost ~]#
# The two outputs below show the card before it is assigned to a VM and after it has been assigned to a VM, respectively
[root@localhost ~]# lspci -v -s 85:00.0
85:00.0 VGA compatible controller: NVIDIA Corporation GM107GL [Tesla M10] (rev a2) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Tesla M10
...
Kernel driver in use: nvidia # not yet passed through to a VM
Kernel modules: nouveau, nvidia_drm, nvidia
[root@localhost ~]# lspci -v -s 85:00.0
85:00.0 VGA compatible controller: NVIDIA Corporation GM107GL [Tesla M10] (rev a2) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Tesla M10
...
Capabilities: [900] Secondary PCI Express
Kernel driver in use: vfio-pci # after being passed through to a VM
Kernel modules: nouveau, nvidia_drm, nvidia
B. (Host) Install the NVIDIA vGPU host driver
Step 1: on top of the passthrough setup above, install the required supporting packages
# Install prerequisites (dkms may not be strictly necessary; the package lives in the EPEL repo)
dnf install epel-release -y
dnf install kernel-headers-$(uname -r)
dnf install kernel-devel-$(uname -r)
dnf install gcc gcc-c++ make dkms kernel-headers kernel-devel -y
# Reboot the server
reboot
Step 2: install version 16.1 of the NVIDIA host driver (note that it is versioned separately from the DLS on-prem license server)
The driver is not publicly available on NVIDIA's website; you need a registered NVIDIA enterprise account to download it.
# Install the vGPU host driver
chmod +x NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run
./NVIDIA-Linux-x86_64-535.104.06-vgpu-kvm.run
# Note: the installer creates two services under /etc/systemd/system/multi-user.target.wants
nvidia-vgpud.service -> /lib/systemd/system/nvidia-vgpud.service
nvidia-vgpu-mgr.service -> /lib/systemd/system/nvidia-vgpu-mgr.service
# First check whether these two services are healthy
systemctl status nvidia-vgpu-mgr.service # running in my case
systemctl status nvidia-vgpud.service # failed in my case (not sure why; ignoring it for now)
# After installation, verify that the NVIDIA kernel driver can communicate with the physical NVIDIA GPUs in the system
[root@sv-gpu-node-001 Public]# nvidia-smi
Sun Sep 24 08:24:33 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.06 Driver Version: 535.104.06 CUDA Version: N/A |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla M10 On | 00000000:85:00.0 Off | N/A |
| N/A 53C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla M10 On | 00000000:86:00.0 Off | N/A |
| N/A 60C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 Tesla M10 On | 00000000:87:00.0 Off | N/A |
| N/A 47C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 3 Tesla M10 On | 00000000:88:00.0 Off | N/A |
| N/A 57C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# Or use the more concise vGPU query
[root@sv-gpu-node-001 ~]# nvidia-smi vgpu
Sun Nov 5 23:24:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.06 Driver Version: 535.104.06 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 Tesla M10 | 00000000:85:00.0 | 0% |
+---------------------------------+------------------------------+------------+
| 1 Tesla M10 | 00000000:86:00.0 | 0% |
+---------------------------------+------------------------------+------------+
| 2 Tesla M10 | 00000000:87:00.0 | 0% |
+---------------------------------+------------------------------+------------+
| 3 Tesla M10 | 00000000:88:00.0 | 0% |
+---------------------------------+------------------------------+------------+
# Verify that the relevant kernel modules were installed and loaded
# Note: starting with kernel 5.15 the mdev module replaces vfio_mdev; vfio can still be used through mdev on kernel 5.15 and later
# https://forum.proxmox.com/threads/was-the-vfio_mdev-module-removed-from-the-5-15-kernel.111335/
# https://cloud-atlas.readthedocs.io/zh-cn/latest/kvm/vgpu/install_vgpu_manager.html
[root@sv-gpu-node-001 ~]# lsmod | grep vfio
nvidia_vgpu_vfio 73728 0
vfio_mdev 16384 0
mdev 24576 2 vfio_mdev,nvidia_vgpu_vfio
kvm 905216 2 nvidia_vgpu_vfio,kvm_intel
vfio_pci 61440 0
irqbypass 16384 15 nvidia_vgpu_vfio,vfio_pci,kvm
vfio_virqfd 16384 1 vfio_pci
vfio_iommu_type1 36864 0
vfio 36864 4 vfio_mdev,nvidia_vgpu_vfio,vfio_iommu_type1,vfio_pci
C. (Host) Create vGPU devices according to NVIDIA's vGPU profiles
Step 1: list the PCI devices that expose mdev types
In the mdev profile names, the trailing number is the framebuffer size and the letter is the vGPU series; the all-round Q series is usually the one to pick, with the framebuffer sized as needed.
# List the BDF (bus address) of every PCI device that exposes mdev types
[root@sv-gpu-node-001 ~]# mdevctl types | grep '^[^ ]*$'
0000:85:00.0
0000:86:00.0
0000:87:00.0
0000:88:00.0
# List the mdev types supported by our GPUs
[root@sv-gpu-node-001 ~]# ls /sys/class/mdev_bus/*/mdev_supported_types
'/sys/class/mdev_bus/0000:85:00.0/mdev_supported_types':
nvidia-155 nvidia-208 nvidia-240 nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
'/sys/class/mdev_bus/0000:86:00.0/mdev_supported_types':
nvidia-155 nvidia-208 nvidia-240 nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
'/sys/class/mdev_bus/0000:87:00.0/mdev_supported_types':
nvidia-155 nvidia-208 nvidia-240 nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
'/sys/class/mdev_bus/0000:88:00.0/mdev_supported_types':
nvidia-155 nvidia-208 nvidia-240 nvidia-35 nvidia-36 nvidia-37 nvidia-38 nvidia-39 nvidia-40 nvidia-41 nvidia-42 nvidia-43 nvidia-44 nvidia-45
# Show the details of each mdev type
[root@sv-gpu-node-001 ~]# mdevctl types
0000:85:00.0
nvidia-155
Available instances: 4
Device API: vfio-pci
Name: GRID M10-2B
Description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4
nvidia-208
Available instances: 4
Device API: vfio-pci
Name: GRID M10-2B4
Description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4
nvidia-240
Available instances: 8
Device API: vfio-pci
Name: GRID M10-1B4
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8
nvidia-35
Available instances: 16
Device API: vfio-pci
Name: GRID M10-0B
Description: num_heads=2, frl_config=45, framebuffer=512M, max_resolution=2560x1600, max_instance=16
nvidia-36
Available instances: 16
Device API: vfio-pci
Name: GRID M10-0Q
Description: num_heads=2, frl_config=60, framebuffer=512M, max_resolution=2560x1600, max_instance=16
nvidia-37
Available instances: 8
Device API: vfio-pci
Name: GRID M10-1A
Description: num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=8
nvidia-38
Available instances: 8
Device API: vfio-pci
Name: GRID M10-1B
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8
nvidia-39
Available instances: 8
Device API: vfio-pci
Name: GRID M10-1Q
Description: num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=8
nvidia-40
Available instances: 4
Device API: vfio-pci
Name: GRID M10-2A
Description: num_heads=1, frl_config=60, framebuffer=2048M, max_resolution=1280x1024, max_instance=4
nvidia-41
Available instances: 4
Device API: vfio-pci
Name: GRID M10-2Q
Description: num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=5120x2880, max_instance=4
nvidia-42
Available instances: 2
Device API: vfio-pci
Name: GRID M10-4A
Description: num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=1280x1024, max_instance=2
nvidia-43
Available instances: 2
Device API: vfio-pci
Name: GRID M10-4Q
Description: num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=5120x2880, max_instance=2
nvidia-44
Available instances: 1
Device API: vfio-pci
Name: GRID M10-8A
Description: num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280x1024, max_instance=1
nvidia-45
Available instances: 1
Device API: vfio-pci
Name: GRID M10-8Q
Description: num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=5120x2880, max_instance=1
# nvidia-45            ---> the vGPU type ID
# Available instances  ---> number of instances that can still be created
# Name                 ---> display name
# Description          ---> details: framebuffer = video memory, frl_config = frame-rate limit (max fps), max resolution, max number of instances
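To get a quick one-line summary per profile without reading the full mdevctl output, you can also walk sysfs directly. A small sketch, using the 0000:86:00.0 card from the listing above as an example:
for t in /sys/class/mdev_bus/0000:86:00.0/mdev_supported_types/*; do
    printf '%s\t%s\t%s instance(s) available\n' "$(basename "$t")" "$(cat "$t/name")" "$(cat "$t/available_instances")"
done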
Step 2: use mdevctl to create and destroy vGPU (mdev) devices
# We use nvidia-42, which provides the M10-4A profile (this profile supports 2 instances per M10 GPU)
# As before, first locate the physical GPU's BDF (the 00000000:86:00.0 shown by nvidia-smi, minus the four leading zeros, is the BDF we need)
# Generate two random UUIDs with the uuid tool; they will identify the mdev devices
[root@sv-gpu-node-001 ~]# dnf install -y uuid
[root@sv-gpu-node-001 ~]# uuid -n 2
05ca1b76-7c87-11ee-9add-6ed335bf9381
05ca1c66-7c87-11ee-9ade-6ed335bf9381
# Create the vGPU devices with the mdevctl start command (option shorthands used below and later)
# -a is short for --auto (used with mdevctl define)
# -p is short for --parent (the parent physical PCI device)
# -t is short for --type
# -u is short for --uuid
[root@sv-gpu-node-001 ~]# mdevctl start -u 05ca1b76-7c87-11ee-9add-6ed335bf9381 -p 0000:86:00.0 -t nvidia-42
[root@sv-gpu-node-001 ~]# mdevctl start -u 05ca1c66-7c87-11ee-9ade-6ed335bf9381 -p 0000:86:00.0 -t nvidia-42
# List all vGPU devices
[root@sv-gpu-node-001 ~]# mdevctl list
05ca1b76-7c87-11ee-9add-6ed335bf9381 0000:86:00.0 nvidia-42 manual
05ca1c66-7c87-11ee-9ade-6ed335bf9381 0000:86:00.0 nvidia-42 manual
# Persist the devices (mdevctl define -a -u UUID is all that is needed)
[root@sv-gpu-node-001 ~]# mdevctl define -a -u 05ca1b76-7c87-11ee-9add-6ed335bf9381
[root@sv-gpu-node-001 ~]# mdevctl define -a -u 05ca1c66-7c87-11ee-9ade-6ed335bf9381
[root@sv-gpu-node-001 ~]# mdevctl list
05ca1b76-7c87-11ee-9add-6ed335bf9381 0000:86:00.0 nvidia-42 auto (defined)
05ca1c66-7c87-11ee-9ade-6ed335bf9381 0000:86:00.0 nvidia-42 auto (defined)
# To remove the persistent definition and stop a device:
# mdevctl undefine -u 05ca1c66-7c87-11ee-9ade-6ed335bf9381
# mdevctl stop -u 05ca1c66-7c87-11ee-9ade-6ed335bf9381
# Use the full node-device name with virsh nodedev-dumpxml to dump the complete GPU configuration
# pci_0000_86_00_0 should look familiar: it is 0000:86:00.0
[root@sv-gpu-node-001 ~]# virsh nodedev-dumpxml pci_0000_86_00_0
<device>
<name>pci_0000_86_00_0</name>
<path>/sys/devices/pci0000:80/0000:80:03.0/0000:83:00.0/0000:84:09.0/0000:86:00.0</path>
<parent>pci_0000_84_09_0</parent>
<driver>
<name>nvidia</name>
</driver>
<capability type='pci'>
<class>0x030000</class>
<domain>0</domain>
<bus>134</bus>
<slot>0</slot>
<function>0</function>
<product id='0x13bd'>GM107GL [Tesla M10]</product>
<vendor id='0x10de'>NVIDIA Corporation</vendor>
<capability type='mdev_types'>
<type id='nvidia-41'>
<name>GRID M10-2Q</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-155'>
<name>GRID M10-2B</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-38'>
<name>GRID M10-1B</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-36'>
<name>GRID M10-0Q</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-208'>
<name>GRID M10-2B4</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-44'>
<name>GRID M10-8A</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-42'>
<name>GRID M10-4A</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>1</availableInstances>
</type>
<type id='nvidia-40'>
<name>GRID M10-2A</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-240'>
<name>GRID M10-1B4</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-39'>
<name>GRID M10-1Q</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-37'>
<name>GRID M10-1A</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-45'>
<name>GRID M10-8Q</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-35'>
<name>GRID M10-0B</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>0</availableInstances>
</type>
<type id='nvidia-43'>
<name>GRID M10-4Q</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>1</availableInstances>
</type>
</capability>
<iommuGroup number='77'>
<address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
</iommuGroup>
<numa node='1'/>
<pci-express>
<link validity='cap' port='9' speed='8' width='16'/>
<link validity='sta' speed='2.5' width='8'/>
</pci-express>
</capability>
</device>
# Use the same virsh nodedev-dumpxml output to extract just the GPU address (domain, bus, slot and function)
[root@sv-gpu-node-001 ~]# virsh nodedev-dumpxml pci_0000_86_00_0 | egrep 'domain|bus|slot|function'
<domain>0</domain>
<bus>134</bus>
<slot>0</slot>
<function>0</function>
<address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
# List all active mdev devices (add --inactive to list the inactive ones)
[root@sv-gpu-node-001 ~]# virsh nodedev-list --cap mdev
mdev_05ca1b76-7c87-11ee-9add-6ed335bf9381_0000_85_00_0
mdev_05ca1b76_7c87_11ee_9ade_6ed335bf9381_0000_86_00_0
[root@sv-gpu-node-001 ~]# mdevctl list -d
05ca1b76-7c87-11ee-9add-6ed335bf9381 0000:86:00.0 nvidia-42 auto (active)
05ca1c66-7c87-11ee-9ade-6ed335bf9381 0000:86:00.0 nvidia-42 auto (active)
Step 3: reference the vGPU in the KVM guest's XML
# Single VM, single vGPU
# For the Q series (virtual workstation) set display='on'; for the C series (machine learning) set display='off'
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
<source>
<address uuid='05ca1b76-7c87-11ee-9add-6ed335bf9381'/>
</source>
</hostdev>
# Alternative form (not recommended: when the VM starts, libvirtd reports an error saying the slot must be greater than 0)
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
<source>
<address uuid='05ca1b76-7c87-11ee-9add-6ed335bf9381'/>
</source>
<address type='pci' domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
</hostdev>
# Single VM, multiple vGPUs
# To attach several vGPUs to one VM, simply repeat the block above (note: all vGPUs attached to the same VM must use the same profile; mixing profiles is not allowed)
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
<source>
<address uuid='05ca1b76-7c87-11ee-9add-6ed335bf9381'/>
</source>
</hostdev>
<hostdev mode='subsystem' type='mdev' managed='no' model='vfio-pci' display='off'>
<source>
<address uuid='05ca1c66-7c87-11ee-9ade-6ed335bf9381'/>
</source>
</hostdev>
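# One way to apply the snippets above (a sketch; "kvm9403" is just the example domain name seen later):
# either paste the <hostdev> block into the <devices> section via `virsh edit kvm9403`,
# or save it as vgpu-hostdev.xml and attach it persistently:
# virsh attach-device kvm9403 vgpu-hostdev.xml --config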
# After starting the VM, check the GPU allocation on the compute node
[root@sv-gpu-node-001 ~]# nvidia-smi
Mon Nov 6 19:54:22 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.06 Driver Version: 535.104.06 CUDA Version: N/A |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla M10 On | 00000000:85:00.0 Off | N/A |
| N/A 49C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Tesla M10 On | 00000000:86:00.0 Off | N/A |
| N/A 54C P8 11W / 53W | 8137MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 Tesla M10 On | 00000000:87:00.0 Off | N/A |
| N/A 42C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 3 Tesla M10 On | 00000000:88:00.0 Off | N/A |
| N/A 49C P8 11W / 53W | 13MiB / 8192MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 1 N/A N/A 10250 C+G vgpu 4062MiB |
| 1 N/A N/A 14814 C+G vgpu 4062MiB |
+---------------------------------------------------------------------------------------+
[root@sv-gpu-node-001 ~]# nvidia-smi vgpu
Mon Nov 6 19:55:06 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.06 Driver Version: 535.104.06 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 Tesla M10 | 00000000:85:00.0 | 0% |
+---------------------------------+------------------------------+------------+
| 1 Tesla M10 | 00000000:86:00.0 | 0% |
| 3251634200 GRID M10-4A | 2297... kvm9403 | 0% |
| 3251634206 GRID M10-4A | cf8c... kvm9410 | 0% |
+---------------------------------+------------------------------+------------+
| 2 Tesla M10 | 00000000:87:00.0 | 0% |
+---------------------------------+------------------------------+------------+
| 3 Tesla M10 | 00000000:88:00.0 | 0% |
+---------------------------------+------------------------------+------------+
# Clear the accounted application PIDs of the vGPUs (run on the compute node)
[root@sv-gpu-node-001 ~]# nvidia-smi vgpu -caa
Cleared Accounted PIDs for vGPU 3251634200
Cleared Accounted PIDs for vGPU 3251634206
D. (Linux guest) Install the GRID driver and configure licensing
On Linux, a failed driver installation shows up as nvidia-smi not working; the usual causes are the following (the quick checks after this list help rule them out):
1. The packages needed to compile the kernel module (gcc, kernel-devel-xxx, etc.) are missing, so the build fails and the installation aborts.
2. Several kernel versions are installed and DKMS is misconfigured, so the driver is compiled against a kernel other than the running one and the kernel module fails to install.
3. The kernel was upgraded after the driver was installed, invalidating the existing installation.
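A few quick checks before (re)installing can rule these causes out. A minimal sketch for the CentOS case (the Ubuntu equivalents would use dpkg/apt):
uname -r                                      # the running kernel
rpm -q gcc dkms kernel-devel kernel-headers   # are the build dependencies installed?
ls /usr/src/kernels/                          # kernel-devel trees available to build against
dkms status                                   # which kernels existing modules were built for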
CentOS family
dnf install -y dkms
rpm -qa | grep -i dkms
rpm -qa | grep kernel-devel
rpm -qa | grep gcc
dnf install -y gcc kernel-devel
chmod +x NVIDIA-Linux-x86_64-535.104.05-grid.run
./NVIDIA-Linux-x86_64-535.104.05-grid.run --disable-nouveau
# Check that the driver is working properly
[root@C20231106136713 ~]# nvidia-smi
Tue Nov 7 14:45:03 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 GRID M10-4A On | 00000000:00:06.0 Off | N/A |
| N/A N/A P8 N/A / N/A | 0MiB / 4096MiB | 0% Prohibited |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
# At this point the VM's vGPU is still unlicensed
[root@C20231106136713 ~]# nvidia-smi -q | grep License
vGPU Software Licensed Product
License Status : Unlicensed (Unrestricted)
# Configure licensing
GPU type | FeatureType value
---|---
NVIDIA vGPU | 1 (the NVIDIA vGPU software automatically selects the correct license type based on the vGPU type)
Physical GPU (passthrough or bare metal) | 0: NVIDIA Virtual Applications; 2: NVIDIA RTX Virtual Workstation; 4: NVIDIA Virtual Compute Server
# Copy the template to create the config file, then edit it
cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
vi /etc/nvidia/gridd.conf # set the FeatureType field to the value from the table above that matches your deployment
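# A non-interactive alternative to the vi step (a sketch: it assumes the template already ships a
# FeatureType= line; the value 2 = RTX Virtual Workstation is only an example, use the table above)
sed -i 's/^FeatureType=.*/FeatureType=2/' /etc/nvidia/gridd.conf
grep FeatureType /etc/nvidia/gridd.conf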
# Copy the client configuration token into the /etc/nvidia/ClientConfigToken directory; to avoid permission problems, set its mode to 744
cd /etc/nvidia/ClientConfigToken
chmod 744 client_configuration_token_11-07-2023-15-48-22.tok
# Restart the nvidia-gridd service
systemctl restart nvidia-gridd.service
# Verify the license
[root@C20231106136713 ~]# nvidia-smi
Tue Nov 7 16:08:55 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 GRID M10-4A On | 00000000:00:06.0 Off | N/A |
| N/A N/A P8 N/A / N/A | 0MiB / 4096MiB | 0% Prohibited |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
[root@C20231106136713 ~]# nvidia-smi -q | grep License
vGPU Software Licensed Product
License Status : Licensed (Expiry: 2023-11-8 8:8:29 GMT)
Ubuntu family
# Install the required build tools
apt install dkms gcc make
# Run the driver installer
./NVIDIA-Linux-x86_64-535.104.05-grid.run --disable-nouveau
# Adjust the gridd configuration file
cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
vi /etc/nvidia/gridd.conf
# Upload the .tok client configuration token into this directory
cd /etc/nvidia/ClientConfigToken
# Reboot the system and verify
root@C20231106136713:~# nvidia-smi -q | grep License
vGPU Software Licensed Product
License Status : Licensed (Expiry: 2023-11-14 23:59:59 GMT)
root@C20231106136713:~# nvidia-smi
Tue Nov 14 09:16:04 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05 Driver Version: 535.104.05 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 GRID M10-4A On | 00000000:00:06.0 Off | N/A |
| N/A N/A P8 N/A / N/A | 0MiB / 4096MiB | 0% Prohibited |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
E. (Windows guest) Install the GRID driver and configure licensing
# First install the GRID driver
# Copy the client configuration token (.tok) file to the %SystemDrive%:\Program Files\NVIDIA Corporation\vGPU Licensing\ClientConfigToken folder.
# Restart the NvDisplayContainer service
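# To verify from the guest, the same nvidia-smi query used on Linux also works from a Windows
# command prompt (assuming the driver installer put nvidia-smi.exe on the PATH, as recent drivers do):
# nvidia-smi -q | findstr License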
Appendix: how to create vGPU devices manually (not recommended)
# M10-2B here is the GRID profile name found in the mdevctl types output (the 2 GB-framebuffer B-series profile)
# M10-2B is simply the profile chosen in advance for this example; you can of course use a different one
domain=0000
bus=85
slot=00
function=0
# Substituting the values above into /sys/class/mdev_bus/${domain}\:${bus}\:${slot}.${function}/mdev_supported_types/ gives
[root@sv-gpu-node-001 nvidia-155]# cd /sys/class/mdev_bus/0000:85:00.0/mdev_supported_types/
[root@sv-gpu-node-001 mdev_supported_types]# grep -l M10-2B nvidia-*/name
nvidia-155/name
nvidia-208/name
[root@sv-gpu-node-001 nvidia-155]# pwd
/sys/class/mdev_bus/0000:85:00.0/mdev_supported_types/nvidia-155
[root@sv-gpu-node-001 nvidia-155]# cat name
GRID M10-2B
# Check how many instances of this vGPU type can still be created (the count decreases by 1 for each vGPU created)
[root@sv-gpu-node-001 nvidia-155]# cat available_instances
4
# To create a vGPU device, write a random UUID into the create file under the chosen type's directory /sys/class/mdev_bus/${domain}\:${bus}\:${slot}.${function}/mdev_supported_types/nvidia-155/
# i.e. /sys/class/mdev_bus/0000:85:00.0/mdev_supported_types/nvidia-155/create
cd /sys/class/mdev_bus/0000:85:00.0/mdev_supported_types/nvidia-155/
UUID=`uuidgen`
echo "$UUID" > create
# Create the vGPU device
[root@sv-gpu-node-001 mdev_supported_types]# cd /sys/class/mdev_bus/0000:85:00.0/mdev_supported_types/nvidia-155/
[root@sv-gpu-node-001 nvidia-155]# ls
available_instances create description device_api devices name
[root@sv-gpu-node-001 nvidia-155]# UUID=`uuidgen`
[root@sv-gpu-node-001 nvidia-155]# echo "$UUID" > create
[root@sv-gpu-node-001 nvidia-155]# cat available_instances
3
# List all vGPU (mdev) instances
[root@sv-gpu-node-001 nvidia-155]# mdevctl list
1c3e1fe7-6fc7-4c60-a95b-af1bd4345e06 0000:85:00.0 nvidia-155 manual
# Check the vGPU device (a new symlink for the virtual vGPU device has appeared under /sys/bus/mdev/devices/)
[root@sv-gpu-node-001 mdev_supported_types]# ls -lh /sys/bus/mdev/devices/
total 0
lrwxrwxrwx 1 root root 0 Nov 6 00:48 1c3e1fe7-6fc7-4c60-a95b-af1bd4345e06 -> ../../../devices/pci0000:80/0000:80:03.0/0000:83:00.0/0000:84:08.0/0000:85:00.0/1c3e1fe7-6fc7-4c60-a95b-af1bd4345e06
# Repeat the creation three more times (devices created manually like this do not survive a reboot)
[root@sv-gpu-node-001 nvidia-155]# UUID=`uuidgen`
[root@sv-gpu-node-001 nvidia-155]# echo "$UUID" > create
[root@sv-gpu-node-001 nvidia-155]# UUID=`uuidgen`
[root@sv-gpu-node-001 nvidia-155]# echo "$UUID" > create
[root@sv-gpu-node-001 nvidia-155]# UUID=`uuidgen`
[root@sv-gpu-node-001 nvidia-155]# echo "$UUID" > create
[root@sv-gpu-node-001 nvidia-155]# cat available_instances
0
# List all vGPU devices again
[root@sv-gpu-node-001 nvidia-155]# mdevctl list
00ca4a0a-1ff1-48a8-9ad8-9224eccbd784 0000:85:00.0 nvidia-155 manual
1c3e1fe7-6fc7-4c60-a95b-af1bd4345e06 0000:85:00.0 nvidia-155 manual
50481072-70ec-4071-9ccb-f1067ade8a48 0000:85:00.0 nvidia-155 manual
ff6f05f3-e96f-4b45-b61f-fa083ff89871 0000:85:00.0 nvidia-155 manual
Appendix: GPU benchmark software
https://benchmark.unigine.com/valley
Appendix: other references
# NVIDIA enterprise login portal
https://nvid.nvidia.com
# Applying to NVIDIA for a trial license (step-by-step tutorial)
https://mp.weixin.qq.com/s/a6U1-GFAM9jXoLfvxO6nGA
# NVIDIA vGPU documentation portal (Chinese)
http://vgpu.com.cn/
# Official NVIDIA vGPU documentation
https://docs.nvidia.com/
https://cloud-atlas.readthedocs.io/zh-cn/latest/kvm/vgpu/install_vgpu_manager.html
https://github.com/mdevctl/mdevctl
https://docs.nvidia.com/grid/10.0/grid-vgpu-user-guide/index.html#creating-vgpu-device-red-hat-el-kvm
https://fairysen.com/844.html
https://pve.sqlsec.com/4/5/ # driver downloads are available here
https://foxi.buduanwang.vip/virtualization/1683.html/