[PATCH 5.14 072/168] powerpc/mce: Fix access error in mce handler

From: Greg Kroah-Hartman
Date: Mon Sep 20 2021 - 14:44:15 EST


From: Ganesh Goudar <ganeshgr@xxxxxxxxxxxxx>

commit 3a1e92d0896e928ac2a5b58962d05a39afef2e23 upstream.

We queue an irq work for deferred processing of mce event in realmode
mce handler, where translation is disabled. Queuing of the work may
result in accessing memory outside RMO region, such access needs the
translation to be enabled for an LPAR running with hash mmu else the
kernel crashes.

After enabling translation in mce_handle_error() we used to leave it
enabled to avoid crashing here, but now with the commit
74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from
handler") we are restoring the MSR to disable translation.

Hence to fix this enable the translation before queuing the work.

Without this change following trace is seen on injecting SLB multihit in
an LPAR running with hash mmu.

Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: G OE 5.14.0-mce+ #137
NIP: c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
REGS: c00000001ebff9a0 TRAP: 0300 Tainted: G OE (5.14.0-mce+)
MSR: 8000000000001003 <SF,ME,RI,LE> CR: 28008228 XER: 00000001
CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
...
NIP llist_add_batch+0x0/0x40
LR __irq_work_queue_local+0x70/0xc0
Call Trace:
0xc00000001ebffc0c (unreliable)
irq_work_queue+0x40/0x70
machine_check_queue_event+0xbc/0xd0
machine_check_early_common+0x16c/0x1f4

Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@xxxxxxxxxxxxx>
[mpe: Fix comment formatting, trim oops in change log for readability]
Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20210909064330.312432-1-ganeshgr@xxxxxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
arch/powerpc/kernel/mce.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)

--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -249,6 +249,7 @@ void machine_check_queue_event(void)
{
int index;
struct machine_check_event evt;
+ unsigned long msr;

if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
return;
@@ -262,8 +263,20 @@ void machine_check_queue_event(void)
memcpy(&local_paca->mce_info->mce_event_queue[index],
&evt, sizeof(evt));

- /* Queue irq work to process this event later. */
- irq_work_queue(&mce_event_process_work);
+ /*
+ * Queue irq work to process this event later. Before
+ * queuing the work enable translation for non radix LPAR,
+ * as irq_work_queue may try to access memory outside RMO
+ * region.
+ */
+ if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
+ msr = mfmsr();
+ mtmsr(msr | MSR_IR | MSR_DR);
+ irq_work_queue(&mce_event_process_work);
+ mtmsr(msr);
+ } else {
+ irq_work_queue(&mce_event_process_work);
+ }
}

void mce_common_process_ue(struct pt_regs *regs,