Re: [patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI

From: Peter Zijlstra
Date: Fri Sep 25 2020 - 11:50:31 EST


On Fri, Sep 25, 2020 at 11:29:13AM -0400, Qian Cai wrote:

> It looks like the crashes happen in the interrupt remapping code where they are
> only able to generate partial call traces.

> [ 8.466614][ T0] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 8.474295][ T0] #PF: supervisor instruction fetch in kernel mode
> [ 8.480669][ T0] #PF: error_code(0x0010) - not-present page
> [ 8.486518][ T0] PGD 0 P4D 0
> [ 8.489757][ T0] Oops: 0010 [#1] SMP KASAN PTI
> [ 8.494476][ T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 5.9.0-rc6-next-20200925 #2
> [ 8.503987][ T0] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 11/13/2019
> [ 8.513238][ T0] RIP: 0010:0x0
> [ 8.516562][ T0] Code: Bad RIP v

Here it looks like this:

[ 1.830276] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 1.838043] #PF: supervisor instruction fetch in kernel mode
[ 1.844357] #PF: error_code(0x0010) - not-present page
[ 1.850090] PGD 0 P4D 0
[ 1.852915] Oops: 0010 [#1] SMP
[ 1.856419] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc6-00700-g0248dedd12d4 #419
[ 1.865447] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 1.876902] RIP: 0010:0x0
[ 1.879824] Code: Bad RIP value.
[ 1.883423] RSP: 0000:ffffffff82803da0 EFLAGS: 00010282
[ 1.889251] RAX: 0000000000000000 RBX: ffffffff8282b980 RCX: ffffffff82803e40
[ 1.897241] RDX: 0000000000000001 RSI: ffffffff82803e40 RDI: ffffffff8282b980
[ 1.905201] RBP: ffff88842f331000 R08: 00000000ffffffff R09: 0000000000000001
[ 1.913162] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000048
[ 1.921123] R13: ffffffff82803e40 R14: ffffffff8282b9c0 R15: 0000000000000000
[ 1.929085] FS: 0000000000000000(0000) GS:ffff88842f400000(0000) knlGS:0000000000000000
[ 1.938113] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.944524] CR2: ffffffffffffffd6 CR3: 0000000002811001 CR4: 00000000000606b0
[ 1.952484] Call Trace:
[ 1.955214] msi_domain_alloc+0x36/0x130
[ 1.959594] __irq_domain_alloc_irqs+0x165/0x380
[ 1.964748] dmar_alloc_hwirq+0x9a/0x120
[ 1.969127] dmar_set_interrupt.part.0+0x1c/0x60
[ 1.974281] enable_drhd_fault_handling+0x2c/0x6c
[ 1.979532] apic_intr_mode_init+0xfa/0x100
[ 1.984191] x86_late_time_init+0x20/0x30
[ 1.988662] start_kernel+0x723/0x7e6
[ 1.992748] secondary_startup_64_no_verify+0xa6/0xab
[ 1.998386] Modules linked in:
[ 2.001794] CR2: 0000000000000000
[ 2.005510] ---[ end trace 837dc60d7c66efa2 ]---
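
For reference, "supervisor instruction fetch" at address 0 together with
"RIP: 0010:0x0" is the usual signature of an indirect call through a NULL
function pointer; the top frame of the trace (msi_domain_alloc+0x36 here) is
then the caller of that pointer. The sketch below only illustrates that
pattern and a defensive check against it. Everything in it (struct demo_ops,
get_id, demo_get_id, demo_alloc) is hypothetical and is not the kernel's code,
and it makes no claim about which callback is actually missing in this series.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table for illustration; NOT a kernel structure. */
struct demo_ops {
	unsigned long (*get_id)(void *arg);	/* may be left NULL by mistake */
};

/* Hypothetical callback that a correctly populated table would point at. */
static unsigned long demo_get_id(void *arg)
{
	return *(unsigned long *)arg;
}

/*
 * Calling ops->get_id() while it is NULL would transfer control to
 * address 0, which is what an oops with "RIP: 0010:0x0" looks like.
 * This sketch checks the pointer first and fails with -EINVAL instead.
 */
static int demo_alloc(const struct demo_ops *ops, void *arg, unsigned long *id)
{
	if (!ops || !ops->get_id)
		return -EINVAL;

	*id = ops->get_id(arg);
	return 0;
}
```

With a table whose get_id member was never set, demo_alloc() returns -EINVAL
and leaves *id untouched; with .get_id = demo_get_id it returns 0 and stores
the value read through arg.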