Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message

From: Thomas Gleixner
Date: Mon Nov 16 2020 - 17:01:09 EST


On Tue, Nov 03 2020 at 19:20, Borislav Petkov wrote:
> On Tue, Nov 03, 2020 at 07:11:15PM +0100, Oleg Nesterov wrote:
>> > I'm thinking copy_code() should not use copy_from_user_nmi() if former
>> > can be called in non-atomic context too.

While copy_from_user_nmi() is named that way, it can be invoked from
other contexts as well. See the comment inside.

>> I understand, but why do you think this makes sense?
>
> Because the copy_from_user_nmi()'s name tells me that it is at least
> supposed to be called in atomic context. At least this is how I
> understand it. And in atomic context regs is supposed to belong to
> current, right?

Whatever context you are in, current can obviously only read its own
user space.

AFAICT, even before the change I did, show_opcodes() did not care and
just either dumped what was available at regs->ip in the current task's
user space mapping or faulted.

Fix below.

Thanks,

tglx
---
Subject: x86/dumpstack: Don't try to access user space code of other tasks
From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Mon, 16 Nov 2020 22:26:52 +0100

sysrq-t ends up invoking show_opcodes() for each task, which tries to access
the user space code of other processes. That is obviously bogus.

It either manages to dump where the foreign task's regs->ip points to in
current's mapping or triggers a pagefault and prints "Code: Bad RIP
value.". Both are just wrong.

Add a safeguard in copy_code() and check whether the @regs pointer matches
current's pt_regs. If not, do not even try to access it.

While at it, add a comment explaining why using copy_from_user_nmi() is
safe in copy_code() even if the function name suggests otherwise.

Reported-by: Mark Mossberg <mark.mossberg@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
arch/x86/kernel/dumpstack.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
 	if (!user_mode(regs))
 		return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);

+	/* The user space code from other tasks cannot be accessed. */
+	if (regs != task_pt_regs(current))
+		return -EPERM;
 	/*
 	 * Make sure userspace isn't trying to trick us into dumping kernel
 	 * memory by pointing the userspace instruction pointer at it.
@@ -85,6 +88,12 @@ static int copy_code(struct pt_regs *reg
 	if (__chk_range_not_ok(src, nbytes, TASK_SIZE_MAX))
 		return -EINVAL;

+	/*
+	 * Even if named copy_from_user_nmi() this can be invoked from
+	 * other contexts and will not try to resolve a pagefault, which is
+	 * the correct thing to do here as this code can be called from any
+	 * context.
+	 */
 	return copy_from_user_nmi(buf, (void __user *)src, nbytes);
 }

@@ -115,13 +124,19 @@ void show_opcodes(struct pt_regs *regs,
 	u8 opcodes[OPCODE_BUFSIZE];
 	unsigned long prologue = regs->ip - PROLOGUE_SIZE;

-	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
-		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
-		       loglvl, prologue);
-	} else {
+	switch (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
+	case 0:
 		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
 		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,
 		       opcodes[PROLOGUE_SIZE], opcodes + PROLOGUE_SIZE + 1);
+		break;
+	case -EPERM:
+		/* No access to the user space stack of other tasks. Ignore. */
+		break;
+	default:
+		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
+		       loglvl, prologue);
+		break;
 	}
 }
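
For illustration only, the control flow of the patch can be modelled in plain
userspace C. This is a rough sketch, not kernel code: every mock_* name, the
struct layouts and the dump_result enum are made up for the example, and the
kernel-mode path of copy_code() is not modelled at all. It only shows how a
regs pointer that does not belong to current yields -EPERM and is silently
skipped, while a failing copy from current's own mapping still ends up in the
"Unable to access opcode bytes" branch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for kernel types; NOT the real definitions. */
struct mock_regs {
	unsigned long ip;		/* offset into the task's mock text */
	int user_mode;			/* models user_mode(regs) */
};

struct mock_task {
	struct mock_regs regs;		/* models task_pt_regs(task) */
	const unsigned char *text;	/* models the task's user mapping */
	size_t text_len;
};

/* Models "current". Only its own text may be read. */
struct mock_task *mock_current;

/* Models copy_code(): 0 on success, negative errno otherwise. */
int mock_copy_code(const struct mock_regs *regs, unsigned char *buf,
		   unsigned long src, size_t nbytes)
{
	assert(buf != NULL && mock_current != NULL);

	if (!regs->user_mode)
		return -EINVAL;		/* kernel path is not modelled here */

	/* The safeguard: regs of a foreign task are never dereferenced. */
	if (regs != &mock_current->regs)
		return -EPERM;

	/* Models a faulting copy: the range must lie inside current's text. */
	if (src > mock_current->text_len ||
	    nbytes > mock_current->text_len - src)
		return -EFAULT;

	memcpy(buf, mock_current->text + src, nbytes);
	return 0;
}

enum dump_result { DUMP_PRINTED, DUMP_SKIPPED, DUMP_UNREADABLE };

/* Models the switch in show_opcodes(), returning which branch was taken. */
enum dump_result mock_show_opcodes(const struct mock_regs *regs,
				   unsigned char *buf, size_t nbytes)
{
	switch (mock_copy_code(regs, buf, regs->ip, nbytes)) {
	case 0:
		return DUMP_PRINTED;	/* "Code: ..." line would be printed */
	case -EPERM:
		return DUMP_SKIPPED;	/* foreign task: print nothing */
	default:
		return DUMP_UNREADABLE;	/* "Unable to access opcode bytes" */
	}
}

/* Small driver: dump either current's own regs or a foreign task's regs. */
enum dump_result mock_demo(int foreign, unsigned long ip)
{
	static const unsigned char text[] = { 0x55, 0x48, 0x89, 0xe5, 0x5d, 0xc3 };
	struct mock_task self = { { ip, 1 }, text, sizeof(text) };
	struct mock_task other = { { ip, 1 }, text, sizeof(text) };
	unsigned char buf[4];

	mock_current = &self;
	return mock_show_opcodes(foreign ? &other.regs : &self.regs,
				 buf, sizeof(buf));
}
```

In this model the ownership check runs before the range check, mirroring the
order in the patch, so a foreign task with a bogus ip is skipped quietly
instead of producing an error line.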