SR-IOV

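SR-IOV (Single Root I/O Virtualization) is a standard for sharing a PCIe device among virtual machines. It gives each virtual machine its own memory space, interrupts, and DMA streams, so data access can bypass the VMM. SR-IOV is built on two kinds of PCIe functions: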

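  • PF (Physical Function): a full-featured PCIe function that includes the SR-IOV extended capability and is used to configure and manage SR-IOV.
  • VF (Virtual Function): a lightweight PCIe function. Each VF has its own dedicated PCI configuration space and may share the same physical resources with other VFs.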
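
On a Linux host, the PF/VF relationship is visible in sysfs: a PF exposes `sriov_totalvfs`, and each enabled VF appears as a `virtfnN` symlink under the PF. The following is a minimal sketch, assuming a Linux host. `list_vfs` is a hypothetical helper name, and the sysfs root is a parameter only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# list_vfs PF_ADDR [SYSFS_ROOT]
# Print the PCI address of every VF that belongs to the given PF.
# list_vfs is an illustrative helper, not part of any standard tool.
list_vfs() {
    pf="$1"
    root="${2:-/sys}"
    dev="$root/bus/pci/devices/$pf"

    # sriov_totalvfs only exists on functions with the SR-IOV capability (PFs).
    if [ ! -f "$dev/sriov_totalvfs" ]; then
        echo "$pf: not an SR-IOV physical function" >&2
        return 1
    fi

    # Each enabled VF is a symlink named virtfn0, virtfn1, ... under the PF.
    for link in "$dev"/virtfn*; do
        [ -L "$link" ] || continue
        basename "$(readlink "$link")"
    done
}

# Usage on a real host:
#   list_vfs 0000:0b:00.0
```

The glob expands in lexical order, so `virtfn10` sorts before `virtfn2`; pipe the output through `sort` if a strict address order matters.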

SR-IOV - Figure 1

SR-IOV requirements

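  • The CPU must support an IOMMU (for example, Intel VT-d or AMD-Vi; Power8 processors support an IOMMU by default).
  • The firmware must support the IOMMU.
  • The CPU root ports must support ACS or an ACS-equivalent capability.
  • The PCIe device must support ACS or an ACS-equivalent capability.
  • Ideally, every PCIe switch between the root port and the PCIe device also supports ACS. If a switch does not support ACS, all PCIe devices behind it share a single IOMMU group and can therefore only be assigned to one virtual machine.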
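
To see how the ACS rule plays out on a given host, it can help to list which devices ended up in which IOMMU group. The sketch below assumes a Linux host with the IOMMU enabled, where groups appear under `/sys/kernel/iommu_groups`. `list_iommu_groups` is a hypothetical helper name, and the sysfs root is a parameter so the function can be run against a test tree.

```shell
#!/bin/sh
# list_iommu_groups [SYSFS_ROOT]
# Print one "group N: ADDRESS" line per device in every IOMMU group.
# list_iommu_groups is an illustrative helper, not a standard command.
list_iommu_groups() {
    root="${1:-/sys}"
    for d in "$root"/kernel/iommu_groups/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$d" ] || [ -L "$d" ] || continue
        group=${d%/devices/*}    # strip the trailing /devices/<address>
        group=${group##*/}       # keep only the group number
        echo "group $group: ${d##*/}"
    done
}

# Usage on a real host:
#   list_iommu_groups
```

Devices that print the same group number have to be assigned to the same virtual machine, which is the situation described for switches without ACS.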

SR-IOV vs PCI passthrough

SR-IOV - Figure 2

SR-IOV - Figure 3

SR-IOV - Figure 4

SR-IOV vs DPDK

SR-IOV vs DPDK - Figures 1 to 3

SR-IOV usage example

Enable VFs:

    # Reload the igb driver with 7 VFs per port
    modprobe -r igb
    modprobe igb max_vfs=7
    # Persist the setting across reboots
    echo "options igb max_vfs=7" >>/etc/modprobe.d/igb.conf

Find the Virtual Functions:

    # lspci | grep 82576
    0b:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0b:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    0b:10.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.4 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.6 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:10.7 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.0 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.1 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.2 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.3 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.4 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)
    0b:11.5 Ethernet controller: Intel Corporation 82576 Virtual Function (rev 01)

    # virsh nodedev-list | grep 0b
    pci_0000_0b_00_0
    pci_0000_0b_00_1
    pci_0000_0b_10_0
    pci_0000_0b_10_1
    pci_0000_0b_10_2
    pci_0000_0b_10_3
    pci_0000_0b_10_4
    pci_0000_0b_10_5
    pci_0000_0b_10_6
    pci_0000_0b_11_7
    pci_0000_0b_11_1
    pci_0000_0b_11_2
    pci_0000_0b_11_3
    pci_0000_0b_11_4
    pci_0000_0b_11_5

    $ virsh nodedev-dumpxml pci_0000_0b_00_0
    <device>
      <name>pci_0000_0b_00_0</name>
      <parent>pci_0000_00_01_0</parent>
      <driver>
        <name>igb</name>
      </driver>
      <capability type='pci'>
        <domain>0</domain>
        <bus>11</bus>
        <slot>0</slot>
        <function>0</function>
        <product id='0x10c9'>82576 Gigabit Network Connection</product>
        <vendor id='0x8086'>Intel Corporation</vendor>
      </capability>
    </device>
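
Note that lspci prints the bus and slot in hexadecimal (`0b:10.0`), while the nodedev-dumpxml output above reports them in decimal (`<bus>11</bus>`). The following sketch converts an lspci-style address into a libvirt `<address>` element with decimal values. `pci_to_libvirt_address` is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# pci_to_libvirt_address ADDR
# Convert an lspci-style address (0b:10.0 or 0000:0b:10.0) into a
# libvirt <address> element with decimal domain/bus/slot/function.
# pci_to_libvirt_address is an illustrative helper, not a libvirt tool.
pci_to_libvirt_address() {
    addr="$1"
    case "$addr" in
        *:*:*) domain=${addr%%:*}; addr=${addr#*:} ;;  # explicit domain
        *)     domain=0000 ;;                          # lspci omits domain 0
    esac
    bus=${addr%%:*}
    rest=${addr#*:}
    slot=${rest%%.*}
    func=${rest#*.}
    # printf %d accepts 0x-prefixed hexadecimal arguments.
    printf "<address type='pci' domain='%d' bus='%d' slot='%d' function='%d'/>\n" \
        "0x$domain" "0x$bus" "0x$slot" "0x$func"
}

# Usage:
#   pci_to_libvirt_address 0b:10.0
#   # → <address type='pci' domain='0' bus='11' slot='16' function='0'/>
```

The first VF at `0b:10.0` therefore corresponds to bus 11, slot 16, function 0 in libvirt XML.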

Attach to a virtual machine through libvirt

    $ cat >/tmp/interface.xml <<EOF
    <interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0' bus='11' slot='16' function='0'/>
      </source>
    </interface>
    EOF
    $ virsh attach-device MyGuest /tmp/interface.xml --live --config

You can also give the interface a MAC address and a VLAN:

    <interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0' bus='11' slot='16' function='0'/>
      </source>
      <mac address='52:54:00:6d:90:02'/>
      <vlan>
        <tag id='42'/>
      </vlan>
      <virtualport type='802.1Qbh'>
        <parameters profileid='finance'/>
      </virtualport>
    </interface>

Attach to a virtual machine through QEMU

    # Pass the VF at 0b:10.0 (not the PF at 0b:00.0) to the guest
    /usr/bin/qemu-kvm -name vdisk -enable-kvm -m 512 -smp 2 \
        -hda /mnt/nfs/vdisk.img \
        -monitor stdio \
        -vnc 0.0.0.0:0 \
        -device pci-assign,host=0b:10.0
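
The command above relies on the legacy `pci-assign` device, which may not be available in newer QEMU builds, where VFIO (`-device vfio-pci`) is the usual mechanism. The following is a sketch of binding a VF to the `vfio-pci` driver through sysfs, assuming a Linux host with the `vfio-pci` module loaded. `bind_to_vfio` is a hypothetical helper name, the VF address comes from the example host, and the sysfs root is a parameter so the steps can be tried on a fake tree.

```shell
#!/bin/sh
# bind_to_vfio ADDR [SYSFS_ROOT]
# Rebind a PCI function (for example a VF) to the vfio-pci driver.
# bind_to_vfio is an illustrative helper, not a standard command.
bind_to_vfio() {
    addr="$1"
    root="${2:-/sys}"
    dev="$root/bus/pci/devices/$addr"

    if [ ! -d "$dev" ]; then
        echo "$addr: no such PCI device" >&2
        return 1
    fi

    # Tell the PCI core that only vfio-pci may claim this device.
    echo vfio-pci > "$dev/driver_override"

    # Detach the current driver (for example igbvf), if one is bound.
    if [ -e "$dev/driver/unbind" ]; then
        echo "$addr" > "$dev/driver/unbind"
    fi

    # Ask the PCI core to probe the device again so vfio-pci picks it up.
    echo "$addr" > "$root/bus/pci/drivers_probe"
}

# Usage on a real host (as root), followed by the QEMU device option:
#   modprobe vfio-pci
#   bind_to_vfio 0000:0b:10.0
#   qemu-kvm ... -device vfio-pci,host=0b:10.0
```

With libvirt, `managed='yes'` performs the equivalent detach and reattach steps automatically, so this manual binding is only relevant when QEMU is started by hand.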

Pros and cons

Pros:

  • More Scalable than Direct Assign
  • Security through IOMMU and function isolation
  • Control Plane separation through PF/VF notion
  • High packet rate, Low CPU, Low latency thanks to Direct Pass through

Cons:

  • Rigid: Composability issues
  • Control plane is pass through, puts pressure on Hardware resources
  • Parts of the PCIe config space are mapped directly from hardware
  • Limited scalability (16 bit)
  • SR-IOV NIC forces switching features into the HW
  • All the Switching Features in the Hardware or nothing

References