...
BugZero found this defect 457 days ago.
Description of problem:
The Q35 + OVMF VM with a mlx5_vfio_pci VF can not be migrated

Version-Release number of selected component (if applicable):
host:
  5.14.0-355.el9.x86_64
  qemu-kvm-8.0.0-13.el9.x86_64
  libvirt-9.7.0-1.el9.x86_64
  edk2-ovmf-20230524-3.el9.noarch
VM:
  5.14.0-355.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. create a MT2910 VF and setup the VF for migration

2. start a Q35 + OVMF VM with a mlx5_vfio_pci VF

<domain type="kvm">
  <name>rhel93</name>
  <uuid>9403cac2-9135-4d85-ab63-98bcdf8a5042</uuid>
  <memory>4194304</memory>
  <currentMemory>4194304</currentMemory>
  <vcpu>4</vcpu>
  <os firmware="efi">
    <type arch="x86_64" machine="q35">hvm</type>
    <boot dev="hd"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode="host-model"/>
  <clock offset="utc">
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="pit" tickpolicy="delay"/>
    <timer name="hpet" present="no"/>
  </clock>
  <pm>
    <suspend-to-mem enabled="no"/>
    <suspend-to-disk enabled="no"/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2" cache="none" io="threads"/>
      <source file="/home/images/migration/RHEL93.qcow2"/>
      <target dev="vda" bus="virtio"/>
    </disk>
    <controller type="usb" model="ich9-ehci1"/>
    <controller type="usb" model="ich9-uhci1">
      <master startport="0"/>
    </controller>
    <controller type="usb" model="ich9-uhci2">
      <master startport="2"/>
    </controller>
    <controller type="usb" model="ich9-uhci3">
      <master startport="4"/>
    </controller>
    <controller type="pci" model="pcie-root"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <controller type="pci" model="pcie-root-port"/>
    <console type="pty"/>
    <input type="tablet" bus="usb"/>
    <tpm model="tpm-crb">
      <backend type="emulator"/>
    </tpm>
    <graphics type="vnc" port="5993" listen="0.0.0.0"/>
    <video>
      <model type="bochs"/>
    </video>
    <hostdev mode="subsystem" type="pci" managed="no">
      <source>
        <address domain="0" bus="0xb1" slot="0x0" function="0x02"/>
      </source>
    </hostdev>
  </devices>
</domain>

3. check the MT2910 VF in the VM

# ifconfig
enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::b35b:2dac:371a:4fa3 prefixlen 64 scopeid 0x20<link>
        ether 52:54:00:01:01:01 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 28 bytes 4568 (4.4 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

# dmesg | grep -Ei "mlx5|enp2s0"
[ 3.953753] mlx5_core 0000:02:00.0: firmware version: 28.37.1014
[ 4.112107] mlx5_core 0000:02:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[ 4.217499] mlx5_core 0000:02:00.0: Supported tc offload range - chains: 1, prios: 1
[ 4.220999] mlx5_core 0000:02:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced)
[ 4.242258] mlx5_core 0000:02:00.0 enp2s0: renamed from eth0
[ 5.361400] mlx5_core 0000:02:00.0 enp2s0: Link up
[ 5.362911] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready

4. migrate the VM

$ sudo virsh migrate --verbose --live rhel93 qemu+ssh://10.8.3.15/system
error: operation failed: job 'migration out' unexpectedly failed

5. check the qemu-kvm log

$ sudo tail -f /var/log/libvirt/qemu/rhel93.log
...
2023-09-01 05:50:42.169+0000: initiating migration
2023-09-01T05:50:42.175652Z qemu-kvm: 0000:b1:00.2: Failed to start DMA logging, err -95 (Operation not supported)
2023-09-01T05:50:42.175777Z qemu-kvm: vfio: Could not start dirty page tracking, err: -95 (Operation not supported)
2023-09-01T05:50:42.378843Z qemu-kvm: Unable to read from socket: Bad file descriptor
2023-09-01T05:50:42.378866Z qemu-kvm: Unable to read from socket: Bad file descriptor
2023-09-01T05:50:42.378872Z qemu-kvm: Unable to read from socket: Bad file descriptor

Actual results:
The Q35 + OVMF VM with a mlx5_vfio_pci VF can not be migrated

Expected results:
The Q35 + OVMF VM with a mlx5_vfio_pci VF can be migrated

Additional info:
(1) How to create a MT2910 VF and setup the VF for migration ?

1.1 load the mlx5_vfio_pci module
# modprobe mlx5_vfio_pci

1.2 create VF
# sudo sh -c "echo 0 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"
# sudo sh -c "echo 1 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"

1.3 set VF mac
# sudo sh -c "ip link set ens2f0np0 vf 0 mac 52:54:00:01:01:01"

1.4 unbind created VF from driver
# sudo sh -c "echo 0000:b1:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"

1.5 set switchdev mode on PF
# sudo sh -c "devlink dev eswitch set pci/0000:b1:00.0 mode switchdev"
# sudo sh -c "devlink dev eswitch show pci/0000:b1:00.0"
pci/0000:b1:00.0: mode switchdev inline-mode none encap-mode basic

1.6 enable VF's migration feature
# sudo sh -c "devlink port function set pci/0000:b1:00.0/1 migratable enable"
# sudo sh -c "devlink port show pci/0000:b1:00.0/1"
… function: hw_addr 52:54:00:01:01:01 roce enable migratable enable

1.7 bind VF to mlx5_vfio_pci driver
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/new_id"
# sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id"
# readlink -f /sys/bus/pci/devices/0000\:b1\:00.2/driver
/sys/bus/pci/drivers/mlx5_vfio_pci
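
For convenience, steps 1.1 to 1.7 above can be collected into one script. This is only a sketch, not something from the original report: the variable names (PF, VF, PF_NETDEV, VF_MAC, VF_PORT, DEVICE_ID), the helper functions, and the DRY_RUN switch are all hypothetical. The defaults are the addresses and IDs used by the reporter and will differ on other hosts. With DRY_RUN=1 (the default here) the script only prints the commands; with DRY_RUN=0 it assumes it is run as root on a host with the same hardware.

```shell
#!/bin/sh
# Sketch of VF setup steps 1.1-1.7. Defaults are the reporter's values;
# all variable and function names are illustrative assumptions.
PF="${PF:-0000:b1:00.0}"                  # physical function PCI address
VF="${VF:-0000:b1:00.2}"                  # virtual function PCI address
PF_NETDEV="${PF_NETDEV:-ens2f0np0}"       # PF network interface name
VF_MAC="${VF_MAC:-52:54:00:01:01:01}"     # MAC assigned to VF 0
VF_PORT="${VF_PORT:-pci/$PF/1}"           # devlink port of the VF
DEVICE_ID="${DEVICE_ID:-15b3 101e}"       # vendor and device ID written to new_id
DRY_RUN="${DRY_RUN:-1}"                   # 1 = print commands only

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# write_sysfs VALUE PATH: write VALUE to a sysfs attribute (or print it).
write_sysfs() {
    if [ "$DRY_RUN" = "1" ]; then
        printf "+ echo '%s' > %s\n" "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

setup_vf() {
    run modprobe mlx5_vfio_pci                                          # 1.1
    write_sysfs 0 "/sys/bus/pci/devices/$PF/sriov_numvfs"               # 1.2
    write_sysfs 1 "/sys/bus/pci/devices/$PF/sriov_numvfs"
    run ip link set "$PF_NETDEV" vf 0 mac "$VF_MAC"                     # 1.3
    write_sysfs "$VF" /sys/bus/pci/drivers/mlx5_core/unbind             # 1.4
    run devlink dev eswitch set "pci/$PF" mode switchdev                # 1.5
    run devlink port function set "$VF_PORT" migratable enable          # 1.6
    write_sysfs "$DEVICE_ID" /sys/bus/pci/drivers/mlx5_vfio_pci/new_id  # 1.7
    write_sysfs "$DEVICE_ID" /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id
    run readlink -f "/sys/bus/pci/devices/$VF/driver"                   # verify binding
}

setup_vf
```

Running it unchanged prints the command sequence for review; setting DRY_RUN=0 and overriding PF, VF and the other variables in the environment would apply the steps on a different host.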
Done-Errata