Re: [PATCH v3 00/26] KVM: x86: Halt and APICv overhaul

From: Sean Christopherson
Date: Wed Dec 08 2021 - 19:03:05 EST


On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> Also got this while trying a VM with passed through device:
>
> [mlevitsk@amdlaptop ~]$[ 34.926140] usb 5-3: reset full-speed USB device number 3 using xhci_hcd
> [ 42.583661] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
> [ 363.562173] VFIO - User Level meta-driver version: 0.3
> [ 365.160357] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@0x154
> [ 384.138110] BUG: kernel NULL pointer dereference, address: 0000000000000021
> [ 384.154039] #PF: supervisor read access in kernel mode
> [ 384.165645] #PF: error_code(0x0000) - not-present page
> [ 384.177254] PGD 16da9d067 P4D 16da9d067 PUD 13ad1a067 PMD 0
> [ 384.190036] Oops: 0000 [#1] SMP
> [ 384.197117] CPU: 3 PID: 14403 Comm: CPU 3/KVM Tainted: G O 5.16.0-rc4.unstable #6
> [ 384.216978] Hardware name: LENOVO 20UF001CUS/20UF001CUS, BIOS R1CET65W(1.34 ) 06/17/2021
> [ 384.235258] RIP: 0010:amd_iommu_update_ga+0x32/0x160
> [ 384.246469] Code: <4c> 8b 62 20 48 8b 4a 18 4d 85 e4 0f 84 ca 00 00 00 48 85 c9 0f 84
> [ 384.288932] RSP: 0018:ffffc9000036fca0 EFLAGS: 00010046
> [ 384.300727] RAX: 0000000000000000 RBX: ffff88810b68ab60 RCX: ffff8881667a6018
> [ 384.316850] RDX: 0000000000000001 RSI: ffff888107476b00 RDI: 0000000000000003

RDX, a.k.a. ir_data is NULL. This check in svm_ir_list_add()

	if (pi->ir_data && (pi->prev_ga_tag != 0)) {

implies pi->ir_data can be NULL, but neither avic_update_iommu_vcpu_affinity()
nor amd_iommu_update_ga() checks ir->data for NULL.

amd_ir_set_vcpu_affinity() returns "success" without clearing pi.is_guest_mode

	/* Note:
	 * This device has never been set up for guest mode.
	 * we should not modify the IRTE
	 */
	if (!dev_data || !dev_data->use_vapic)
		return 0;

so it's plausible svm_ir_list_add() could add to the list with a NULL pi->ir_data.
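
To make the suspected sequence concrete, here is a small userspace model of it.
This is only a sketch: every model_* name is hypothetical and stands in for the
kernel structures and functions named above, it is not kernel code, and the
guard it shows is an illustration of where a NULL check could sit rather than a
proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the posted-interrupt data handed to the IOMMU. */
struct model_pi_data {
	bool is_guest_mode;
	void *ir_data;		/* stays NULL if the IOMMU side bails early */
};

/* Hypothetical stand-in for an entry on the per-vCPU IR list. */
struct model_ir_entry {
	void *data;
	struct model_ir_entry *next;
};

/*
 * Models the early "return 0" path: success is reported, but ir_data is
 * never filled in and is_guest_mode is left exactly as the caller set it.
 */
static int model_set_vcpu_affinity(struct model_pi_data *pi, bool use_vapic)
{
	static int fake_irte;	/* placeholder object for the "set up" case */

	if (!use_vapic)
		return 0;

	pi->ir_data = &fake_irte;
	return 0;
}

/*
 * Models the list walk done on vCPU load/put.  Returns the number of
 * entries skipped because their data was NULL, or -1 where an unguarded
 * walk would have dereferenced a NULL pointer.
 */
static int model_update_all(const struct model_ir_entry *head, bool guard)
{
	int skipped = 0;

	for (; head; head = head->next) {
		if (!head->data) {
			if (!guard)
				return -1;	/* the oops would happen here */
			skipped++;
			continue;
		}
		/* the real code would update the IRTE here */
	}
	return skipped;
}
```

In this model, calling model_set_vcpu_affinity() with use_vapic == false leaves
ir_data NULL while still returning 0, so a caller that trusts the return value
and is_guest_mode ends up queueing an entry whose data is NULL.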

But none of the relevant code has seen any meaningful changes since 5.15, so odds
are good I broke something :-/