Re: [PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return

From: Nicolai Stange
Date: Wed Jan 30 2019 - 12:18:11 EST


Michael Ellerman <mpe@xxxxxxxxxxxxxx> writes:

> Joe Lawrence <joe.lawrence@xxxxxxxxxx> writes:
>> From: Nicolai Stange <nstange@xxxxxxx>
>>
>> The ppc64 specific implementation of the reliable stacktracer,
>> save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
>> trace" whenever it finds an exception frame on the stack. Stack frames
>> are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
>> as written by exception prologues, is found at a particular location.
>>
>> However, as observed by Joe Lawrence, it is possible in practice that
>> non-exception stack frames can alias with prior exception frames and thus,
>> that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
>> the stack. It in turn falsely reports an unreliable stacktrace and blocks
>> any live patching transition from finishing. Said condition lasts until the
>> stack frame is overwritten/initialized by a function call or other means.
>>
>> In principle, we could mitigate this by making the exception frame
>> classification condition in save_stack_trace_tsk_reliable() stronger:
>> in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
>> account that for all exceptions executing on the kernel stack
>>   - their stack frames' backlink pointers always match what is saved
>>     in their pt_regs instance's ->gpr[1] slot and that
>>   - their exception frame size equals STACK_INT_FRAME_SIZE, a value
>>     uncommonly large for non-exception frames.
>>
>> However, while these are currently true, relying on them would make the
>> reliable stacktrace implementation more sensitive towards future changes in
>> the exception entry code. Note that false negatives, i.e. not detecting
>> exception frames, would silently break the live patching consistency model.
>>
>> Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
>> rely on STACK_FRAME_REGS_MARKER as well.
>>
>> Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
>> for those exceptions running on the "normal" kernel stack and returning
>> to kernelspace: because the topmost frame is ignored by the reliable stack
>> tracer anyway, returns to userspace don't need to take care of clearing
>> the marker.
>>
>> Furthermore, as I don't have the ability to test this on Book 3E or
>> 32 bits, limit the change to Book 3S and 64 bits.
>>
>> Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
>> PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
>> on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
>> PPC_BOOK3S_64, there's no functional change here.
>
> That has nothing to do with the fix and should really be in a separate
> patch.
>
> I can split it when applying.

If you don't mind, that would be nice! Or simply drop that
chunk... Otherwise, let me know if I should send a split v2 for this
patch [1/4] only.
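
For reference, the stale-marker problem from the quoted commit message can
be modelled in userspace roughly as below. This is only a sketch under
stated assumptions, not the kernel code: the frame size, the marker slot
index and all function names are made up for illustration, and the marker
value ("regshere" in ASCII) is quoted from memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "regshere" in ASCII; assumed to match the 64-bit marker value. */
#define REGS_MARKER   0x7265677368657265ULL
/* Hypothetical layout for this model only, not the real ABI offsets. */
#define FRAME_WORDS   16
#define MARKER_SLOT   12
#define STACK_FRAMES  4

static uint64_t model_stack[STACK_FRAMES][FRAME_WORDS];

/* The tracer's test: a frame is an exception frame iff the magic is found. */
static int looks_like_exception_frame(const uint64_t *frame)
{
	return frame[MARKER_SLOT] == REGS_MARKER;
}

/* Exception prologue: write the marker into the new frame. */
static void exception_entry(uint64_t *frame)
{
	frame[MARKER_SLOT] = REGS_MARKER;
}

/* Exception return: optionally clear the marker again. */
static void exception_exit(uint64_t *frame, int clear_marker)
{
	if (clear_marker)
		frame[MARKER_SLOT] = 0;
}

/*
 * Take an exception in frame `idx`, return from it, then let an ordinary
 * function reuse the same memory without touching the marker slot.
 * Returns 1 if the tracer would misclassify the ordinary frame.
 */
static int stale_marker_seen(int idx, int clear_marker)
{
	uint64_t *frame;

	assert(idx >= 0 && idx < STACK_FRAMES);
	memset(model_stack, 0, sizeof(model_stack));
	frame = model_stack[idx];

	exception_entry(frame);
	exception_exit(frame, clear_marker);

	/* The ordinary frame aliases the old one; it only sets its backlink. */
	frame[0] = (uint64_t)(uintptr_t)model_stack[(idx + 1) % STACK_FRAMES];

	return looks_like_exception_frame(frame);
}
```

In this model, stale_marker_seen(1, 0) returns 1 (the leftover marker makes
an ordinary frame look like an exception frame), while
stale_marker_seen(1, 1) returns 0 once the exit path clears the marker.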

Thanks,

Nicolai

--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)