On Sat, Aug 19, 2023 at 06:22:12PM +0800, Shuai Xue wrote:
When a process tries to access a page that is already offline
How does this happen?
the kernel will send a sigbus signal with the BUS_MCEERR_AR code. This
signal is typically handled by a registered sigbus handler in the
process. However, if the process does not have a registered sigbus
handler, it is important for end users to be informed about what
happened.
To address this, add an error message similar to those implemented on
the x86, powerpc, and parisc platforms.
Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
---
arch/arm64/mm/fault.c | 2 ++
arch/parisc/mm/fault.c | 5 ++---
arch/x86/mm/fault.c | 3 +--
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 3fe516b32577..38e2186882bd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -679,6 +679,8 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
} else if (fault & (VM_FAULT_HWPOISON_LARGE | VM_FAULT_HWPOISON)) {
unsigned int lsb;
+ pr_err("MCE: Killing %s:%d due to hardware memory corruption fault at %lx\n",
+ current->comm, current->pid, far);
lsb = PAGE_SHIFT;
if (fault & VM_FAULT_HWPOISON_LARGE)
lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault));
Hmm, I'm not convinced by this. We have 'show_unhandled_signals' already,
and there's plenty of code in memory-failure.c for handling poisoned pages
reported by e.g. GHES. I don't think dumping extra messages in dmesg from
the arch code really adds anything.