[RFC PATCH] Fix EFI callbacks on UV during kexec

From: Alex Thorlton
Date: Tue Jul 26 2016 - 18:46:14 EST


Hey everyone,

This is a fix for our BIOS init code to skip mapping in runtime services
when runtime_disabled == true. This is one that snuck under the radar
for a while, since we were using EFI_OLD_MEMMAP for so long. I've
explained the details of how it went unnoticed in the commit message.
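
As a rough illustration only (this is not the patch itself), the kind of
guard described above could be sketched in standalone C as follows. The
struct, the sentinel and the function name are hypothetical stand-ins,
not the kernel's actual symbols:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an "invalid table address" sentinel. */
#define INVALID_TABLE_ADDR UINT64_MAX

/* Hypothetical model of the firmware state the BIOS init code inspects. */
struct fw_state {
    uint64_t uv_systab;     /* physical address of the UV system table, 0 if absent */
    bool runtime_disabled;  /* true when EFI runtime services are disabled, e.g. by noefi */
};

/*
 * Returns true only when it is safe to map the UV system table and later
 * issue BIOS callbacks through it.
 */
static bool should_map_uv_systab(const struct fw_state *fw)
{
    /* No usable table: nothing to map. */
    if (fw->uv_systab == INVALID_TABLE_ADDR || fw->uv_systab == 0)
        return false;

    /* Runtime services disabled: skip the mapping entirely. */
    if (fw->runtime_disabled)
        return false;

    return true;
}
```

With a valid table address and runtime_disabled set, the helper returns
false, so the init path would bail out before any callback is attempted.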

After investigating the problem and figuring out the proper way to get
the noefi parameter working again, I noticed that EFI runtime callbacks
now appear to be supported in a kexec'd kernel. I think we need some
more cleanup here to get that fully working. Without noefi, we hit a
bad paging request when we try to do EFI callbacks:

[ 0.292926] UV: UVsystab: Revision:1
[ 0.296913] UV: No UVsystab socket table, ignoring
[ 0.302261] UV: N:4 M:36 m_shift:28 n_lshift:39
[ 0.307317] UV: gpa_mask/shift:0xffffffffff/0 pnode_mask:0xf apic_pns:5
[ 0.314697] UV: mmr_base/shift:0xff40000000/26 gru_base/shift:0x0/0
[ 0.321692] UV: gnode_upper:0x0 gnode_extra:0x0
[ 0.326746] UV: NODE_PRESENT_DEPTH = 16
[ 0.331025] UV: NODE_PRESENT(0) = 0x0000000000000001
[ 0.336569] UV: Found 1 hubs, 1 nodes, 10 cpus
[ 0.341531] BUG: unable to handle kernel paging request at 000000006a1ab938
[ 0.349319] IP: [<000000006a1ab938>] 0x6a1ab938
[ 0.354386] PGD 354e0063 PUD 0
[ 0.357910] Oops: 0010 [#1] SMP
[ 0.361414] Modules linked in:
[ 0.364833] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-runtime-check+ #713
[ 0.372988] Hardware name: SGI UV3000/UV3000, BIOS SGI UV 3000 series BIOS 01/15/2015
[ 0.381725] task: ffff880035614040 ti: ffff880035618000 task.ti: ffff880035618000
[ 0.390075] RIP: 0010:[<000000006a1ab938>] [<000000006a1ab938>] 0x6a1ab938
[ 0.397855] RSP: 0000:ffff88003561bbe8 EFLAGS: 00010086
[ 0.403780] RAX: 0000000000000000 RBX: ffffc90000006000 RCX: 0000000000000001
[ 0.411741] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000006a1ab938
[ 0.419705] RBP: ffff88003561bc90 R08: ffff88003561bd10 R09: ffff88003561bd18
[ 0.427667] R10: ffff8800354d4000 R11: 00000000000000c9 R12: 0000000000000000
[ 0.435630] R13: 0000000000000000 R14: ffff88003561bd18 R15: 0000000000000001
[ 0.443592] FS: 0000000000000000(0000) GS:ffff880034800000(0000) knlGS:0000000000000000
[ 0.452621] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.459033] CR2: 000000006a1ab938 CR3: 00000000354d3000 CR4: 00000000001406f0
[ 0.466996] Stack:
[ 0.469236] ffffffff8105e148 0000000000000046 0000000000000096 0000000000000096
[ 0.477532] ffff88003561bc28 0000000000000000 0000000000000000 ffff88003561bc90
[ 0.485826] 0000000080050033 0000000000000000 0000000000000000 0000000000000000
[ 0.494121] Call Trace:
[ 0.496851] [<ffffffff8105e148>] ? efi_call+0x58/0x90
[ 0.502586] [<ffffffff81061812>] uv_bios_call+0x82/0x120
[ 0.508609] [<ffffffff81061930>] uv_bios_call_irqsave+0x20/0x40
[ 0.515310] [<ffffffff81061990>] uv_bios_get_sn_info+0x40/0xb0
[ 0.521921] [<ffffffff81b76ed4>] uv_system_init+0x8b6/0x143e
[ 0.528337] [<ffffffff810c1105>] ? vprintk_emit+0x225/0x470
[ 0.534645] [<ffffffff81b71556>] native_smp_prepare_cpus+0x299/0x2e4
[ 0.541836] [<ffffffff81b62197>] kernel_init_freeable+0xc3/0x220
[ 0.548638] [<ffffffff815c9cce>] kernel_init+0xe/0x110
[ 0.554467] [<ffffffff815d5abf>] ret_from_fork+0x1f/0x40
[ 0.560491] [<ffffffff815c9cc0>] ? rest_init+0x80/0x80
[ 0.566320] Code: Bad RIP value.
[ 0.570035] RIP [<000000006a1ab938>] 0x6a1ab938
[ 0.575197] RSP <ffff88003561bbe8>
[ 0.579087] CR2: 000000006a1ab938
[ 0.582786] ---[ end trace 99fd1a588f7287b9 ]---

This happens because the efi_map_region_fixed calls in
kexec_enter_virtual_mode, which map in the EFI runtime memory
descriptors, only map the virtual address of each descriptor.
Unfortunately, our BIOS callbacks still rely on the physical address of
our EFI runtime code being mapped in, so that code is not accessible in
the kexec scenario.
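
To make the failure mode concrete, here is a toy model in standalone C.
It is only a sketch: the descriptor layout, the helper names and the
base addresses are made up for illustration, and only the faulting
address 0x6a1ab938 comes from the oops above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical runtime descriptor: one region with a physical and a virtual base. */
struct runtime_desc {
    uint64_t phys;
    uint64_t virt;
    uint64_t size;
};

/* One mapped address range in a toy "page table". */
struct mapping {
    uint64_t start;
    uint64_t size;
};

/* Models the kexec path as described above: only the virtual range gets mapped. */
static struct mapping map_virt_only(const struct runtime_desc *d)
{
    struct mapping m = { d->virt, d->size };
    return m;
}

/* True if addr falls inside the mapped range. */
static bool is_mapped(const struct mapping *m, uint64_t addr)
{
    return addr >= m->start && addr - m->start < m->size;
}
```

If the descriptor covers physical 0x6a1a0000 with some virtual base, a
lookup of the physical address 0x6a1ab938 fails in this model, while the
same offset from the virtual base succeeds. That mirrors a call through
a stale physical pointer hitting an unmapped address.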

A potential fix for this would be to map in the physical addresses of
the descriptors as well as the virtual addresses in
efi_map_region_fixed, but the more "correct" fix would be to update
our system table pointer to its new virtual address during
SetVirtualAddressMap. We intend to get that piece fixed up relatively
soon, but haven't quite gotten around to it yet.
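
For the second option, the pointer update amounts to rebasing a physical
address onto the descriptor's virtual range. A minimal sketch in
standalone C, assuming a hypothetical descriptor layout and helper name
(this is not firmware or kernel code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical runtime descriptor: one region with a physical and a virtual base. */
struct runtime_desc {
    uint64_t phys;
    uint64_t virt;
    uint64_t size;
};

/*
 * Translate a physical address that lies inside the descriptor into its
 * virtual counterpart. Returns false (leaving *virt_out untouched) when
 * the address is outside the region.
 */
static bool rebase_to_virt(const struct runtime_desc *d, uint64_t phys,
                           uint64_t *virt_out)
{
    if (phys < d->phys || phys - d->phys >= d->size)
        return false;

    /* Same offset within the region, new base. */
    *virt_out = d->virt + (phys - d->phys);
    return true;
}
```

A pointer rebased this way would land inside the range that
efi_map_region_fixed already maps, so no extra physical mapping would be
needed.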

Let me know what you guys think!

Alex Thorlton (1):
  Skip UV runtime services mapping in the efi_runtime_disabled case

 arch/x86/platform/uv/bios_uv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
1.8.5.6