Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return

From: Michael Ellerman
Date: Thu Jan 31 2019 - 00:46:17 EST


Nicolai Stange <nstange@xxxxxxx> writes:

> Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:
>
>> Joe Lawrence <joe.lawrence@xxxxxxxxxx> writes:
>>> From: Nicolai Stange <nstange@xxxxxxx>
>>>
>>> The ppc64 specific implementation of the reliable stacktracer,
>>> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
>>> trace" whenever it finds an exception frame on the stack. Stack frames
>>> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
>>> as written by exception prologues, is found at a particular location.
>>>
>>> However, as observed by Joe Lawrence, it is possible in practice that
>>> non-exception stack frames can alias with prior exception frames and thus,
>>> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
>>> the stack. It then falsely reports an unreliable stacktrace and prevents
>>> any live patching transition from finishing. This condition lasts until the
>>> stack frame is overwritten/initialized by a function call or other means.
>>>
>>> In principle, we could mitigate this by making the exception frame
>>> classification condition in save_stack_trace_tsk_reliable() stronger:
>>> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
>>> account that for all exceptions executing on the kernel stack
>>>   - their stack frames' backlink pointers always match what is saved
>>>     in their pt_regs instance's ->gpr[1] slot and that
>>>   - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>>>     uncommonly large for non-exception frames.
>>>
>>> However, while these are currently true, relying on them would make the
>>> reliable stacktrace implementation more sensitive towards future changes in
>>> the exception entry code. Note that false negatives, i.e. not detecting
>>> exception frames, would silently break the live patching consistency model.
>>>
>>> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
>>> rely on STACK_FRAME_REGS_MARKER as well.
>>>
>>> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
>>> for those exceptions running on the "normal" kernel stack and returning
>>> to kernelspace. Because the topmost frame is ignored by the reliable stack
>>> tracer anyway, returns to userspace don't need to take care of clearing
>>> the marker.
>>>
>>> Furthermore, as I don't have the ability to test this on Book 3E or
>>> 32 bits, limit the change to Book 3S and 64 bits.
>>>
>>> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
>>> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
>>> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
>>> PPC_BOOK3S_64, there's no functional change here.
>>
>> That has nothing to do with the fix and should really be in a separate
>> patch.
>>
>> I can split it when applying.
>
> If you don't mind, that would be nice! Or simply drop that
> chunk... Otherwise, let me know if I shall send a split v2 for this
> patch [1/4] only.

No worries, I split it out:

commit a50d3250d7ae34c561177a1f9cfb79816fcbcff1
Author: Nicolai Stange <nstange@xxxxxxx>
AuthorDate: Thu Jan 31 16:41:50 2019 +1100
Commit: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
CommitDate: Thu Jan 31 16:43:29 2019 +1100

    powerpc/64s: Make reliable stacktrace dependency clearer

    Make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
    PPC_BOOK3S_64 for documentation purposes. Before this patch, it
    depended on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN
    implies PPC_BOOK3S_64, there's no functional change here.

    Signed-off-by: Nicolai Stange <nstange@xxxxxxx>
    Signed-off-by: Joe Lawrence <joe.lawrence@xxxxxxxxxx>
    [mpe: Split out of larger patch]
    Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..73bf87b1d274 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -220,7 +220,7 @@ config PPC
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_RCU_TABLE_FREE		if SMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
-	select HAVE_RELIABLE_STACKTRACE		if PPC64 && CPU_LITTLE_ENDIAN
+	select HAVE_RELIABLE_STACKTRACE		if PPC_BOOK3S_64 && CPU_LITTLE_ENDIAN
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_IRQ_TIME_ACCOUNTING


cheers