Re: [PATCH] /dev/mem: Bail out upon SIGKILL when reading memory.

From: Linus Torvalds
Date: Thu Aug 22 2019 - 18:08:26 EST


On Tue, Aug 20, 2019 at 3:07 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> syzbot found that a thread can stall for minutes inside read_mem()
> after that thread was killed by SIGKILL [1]. Reading 2GB at one read()
> is legal, but delaying termination of killed thread for minutes is bad.

Side note: we might even just allow regular signals to interrupt
/dev/mem reads. We already do that for /dev/zero, and the risk of
breaking something is likely fairly low since nothing should use that
thing anyway.
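
As a rough userspace illustration of what "letting a signal interrupt a long read" means, the sketch below copies data in page-sized chunks and checks for a pending signal before each chunk. It is not kernel code. Every model_* name, the chunk size, and the -1 return value are assumptions made up for this example; the real drivers would use signal_pending(current) or fatal_signal_pending(current) and a proper errno.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_CHUNK 4096u /* stand-in for PAGE_SIZE; an assumption */

/*
 * Hypothetical test knob: the simulated signal "arrives" once this many
 * chunks have been copied. SIZE_MAX means no signal ever arrives.
 */
static size_t model_signal_after_chunks = SIZE_MAX;

/* Hypothetical stand-in for the kernel's signal_pending(current). */
static bool model_signal_pending(size_t chunks_done)
{
	return chunks_done >= model_signal_after_chunks;
}

/*
 * Copy count bytes in chunks, checking for a pending signal before each
 * chunk. Returns the number of bytes copied, or -1 (standing in for an
 * error such as -EINTR) if the signal arrived before any byte was copied.
 */
static long model_read(char *dst, const char *src, size_t count)
{
	size_t done = 0;
	size_t chunks = 0;

	assert(dst != NULL && src != NULL);

	while (done < count) {
		size_t n = count - done;

		if (n > MODEL_CHUNK)
			n = MODEL_CHUNK;
		if (model_signal_pending(chunks))
			break;	/* bail out instead of finishing the copy */
		memcpy(dst + done, src + done, n);
		done += n;
		chunks++;
	}
	return (done > 0 || count == 0) ? (long)done : -1;
}
```

With model_signal_after_chunks set to 1, a three-chunk model_read() returns after one chunk with a short count, so the caller gets back to signal handling quickly instead of after the whole copy.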

Also, if killing the thread is delayed for minutes, that implies that
we're probably still faulting in pages for the read_mem(). Which
points to another possible thing we could do in general: just don't
bother to handle page faults when a fatal signal is pending.

That situation might happen for other random cases too, and is not
limited to /dev/mem. So maybe it's worth trying? Does that essentially
fix the /dev/mem read case too in practice?
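
The idea of not handling page faults once a fatal signal is pending can be modeled in plain userspace C as an early exit at the top of the fault handler. This is only a sketch under stated assumptions: struct model_task, MODEL_FLAG_KILLABLE, and the model_* functions are hypothetical stand-ins for current, FAULT_FLAG_KILLABLE, and the kernel fault path, and the counter stands in for the real (expensive) fault work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FLAG_KILLABLE 0x1u /* hypothetical analogue of FAULT_FLAG_KILLABLE */

enum model_fault_result { MODEL_FAULT_OK, MODEL_FAULT_FAILED };

/* Hypothetical stand-in for the task state the kernel keeps in current. */
struct model_task {
	bool fatal_signal_pending;
	size_t faults_handled;	/* counts faults that did the expensive work */
};

/* Refuse to do any fault work if the task is already dying. */
static enum model_fault_result
model_handle_fault(struct model_task *task, unsigned int flags)
{
	assert(task != NULL);

	if ((flags & MODEL_FLAG_KILLABLE) && task->fatal_signal_pending)
		return MODEL_FAULT_FAILED;

	task->faults_handled++;	/* stands in for actually faulting in a page */
	return MODEL_FAULT_OK;
}

/*
 * Touch npages pages, one fault each. The simulated SIGKILL arrives just
 * before page number kill_after is touched. Returns the pages touched.
 */
static size_t model_touch_pages(struct model_task *task, size_t npages,
				size_t kill_after)
{
	size_t touched = 0;

	while (touched < npages) {
		if (touched == kill_after)
			task->fatal_signal_pending = true;
		if (model_handle_fault(task, MODEL_FLAG_KILLABLE) != MODEL_FAULT_OK)
			break;	/* caller sees the failure and stops early */
		touched++;
	}
	return touched;
}
```

In this model a task killed before page 3 of 10 stops after three handled faults rather than ten, which is the effect the early exit is meant to have on a long copy.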

COMPLETELY untested patch attached, it may or may not make a
difference (and it may or may not work at all ;)

Linus

 arch/x86/mm/fault.c | 15 ++++++++++++---
 mm/memory.c         |  5 +++++
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 9ceacd1156db..d6c029a6cb90 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1033,8 +1033,15 @@ static noinline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address, vm_fault_t fault)
 {
-	if (fatal_signal_pending(current) && !(error_code & X86_PF_USER)) {
-		no_context(regs, error_code, address, 0, 0);
+	/*
+	 * If we already have a fatal signal, don't bother adding
+	 * a new one. If it's a kernel access, just make it fail,
+	 * and if it's a user access just return to let the process
+	 * die.
+	 */
+	if (fatal_signal_pending(current)) {
+		if (!(error_code & X86_PF_USER))
+			no_context(regs, error_code, address, 0, 0);
 		return;
 	}

@@ -1389,7 +1396,8 @@ void do_user_addr_fault(struct pt_regs *regs,
 			return;
 		}
 retry:
-		down_read(&mm->mmap_sem);
+		if (down_read_killable(&mm->mmap_sem))
+			goto fatal_signal;
 	} else {
 		/*
 		 * The above down_read_trylock() might have succeeded in
@@ -1455,6 +1463,7 @@ void do_user_addr_fault(struct pt_regs *regs,
 				goto retry;
 		}

+fatal_signal:
 		/* User mode? Just return to handle the fatal exception */
 		if (flags & FAULT_FLAG_USER)
 			return;
diff --git a/mm/memory.c b/mm/memory.c
index e2bb51b6242e..7ad62f96b08e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3988,6 +3988,11 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 					    flags & FAULT_FLAG_REMOTE))
 		return VM_FAULT_SIGSEGV;

+	if (flags & FAULT_FLAG_KILLABLE) {
+		if (fatal_signal_pending(current))
+			return VM_FAULT_SIGSEGV;
+	}
+
 	/*
 	 * Enable the memcg OOM handling for faults triggered in user
 	 * space. Kernel faults are handled more gracefully.