Re: [PATCH] kfence: Use pt_regs to generate stack trace on faults

From: Mark Rutland
Date: Thu Nov 05 2020 - 06:16:00 EST


On Thu, Nov 05, 2020 at 12:11:19PM +0100, Marco Elver wrote:
> On Thu, 5 Nov 2020 at 11:52, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > On Thu, Nov 05, 2020 at 10:21:33AM +0100, Marco Elver wrote:
> > > Instead of removing the fault handling portion of the stack trace based
> > > on the fault handler's name, just use struct pt_regs directly.
> > >
> > > Change kfence_handle_page_fault() to take a struct pt_regs, and plumb it
> > > through to kfence_report_error() for out-of-bounds, use-after-free, or
> > > invalid access errors, where pt_regs is used to generate the stack
> > > trace.
> > >
> > > If the kernel is a DEBUG_KERNEL, also show registers for more
> > > information.
> > >
> > > Suggested-by: Mark Rutland <mark.rutland@xxxxxxx>
> > > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> >
> > Wow; I wasn't expecting this to be put together so quickly, thanks for
> > doing this!
> >
> > From a scan, this looks good to me -- just one question below.
> >
> > > diff --git a/include/linux/kfence.h b/include/linux/kfence.h
> > > index ed2d48acdafe..98a97f9d43cd 100644
> > > --- a/include/linux/kfence.h
> > > +++ b/include/linux/kfence.h
> > > @@ -171,6 +171,7 @@ static __always_inline __must_check bool kfence_free(void *addr)
> > > /**
> > > * kfence_handle_page_fault() - perform page fault handling for KFENCE pages
> > > * @addr: faulting address
> > > + * @regs: current struct pt_regs (can be NULL, but shows full stack trace)
> > > *
> > > * Return:
> > > * * false - address outside KFENCE pool,
> >
> > > @@ -44,8 +44,12 @@ static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries
> > > case KFENCE_ERROR_UAF:
> > > case KFENCE_ERROR_OOB:
> > > case KFENCE_ERROR_INVALID:
> > > - is_access_fault = true;
> > > - break;
> > > + /*
> > > + * kfence_handle_page_fault() may be called with pt_regs
> > > + * set to NULL; in that case we'll simply show the full
> > > + * stack trace.
> > > + */
> > > + return 0;
> >
> > For both the above comments, when/where is kfence_handle_page_fault()
> > called with regs set to NULL? I couldn't spot that in this patch, so
> > unless I mised it I'm guessing that's somewhere outside of the patch
> > context?
>
> Right, currently it's not expected to happen, but I'd like to permit
> this function being called not from fault handlers, for use-cases like
> this:
>
> https://lkml.kernel.org/r/CANpmjNNxAvembOetv15FfZ=04mpj0Qwx+1tnn22tABaHHRRv=Q@xxxxxxxxxxxxxx
>
> The revised recommendation when trying to get KFENCE to give us more
> information about allocation/free stacks after refcount underflow
> (like what Paul was trying to do) would be to call
> kfence_handle_page_fault(addr, NULL).
>
> > If this is a case we don't expect to happen, maybe add a WARN_ON_ONCE()?
>
> While it's currently not expected, I don't see why we should make this
> WARN and limit the potential uses of the API if it works just fine if
> we pass regs set to NULL. Although arguably the name
> kfence_handle_page_fault() might be confusing for such uses, for now,
> until more widespread use is evident (if at all) I'd say let's keep
> as-is, but simply not prevent such use-cases.

Fair enough! I guess in future we could always revise that anyhow.

FWIW, for this as-is:

Acked-by: Mark Rutland <mark.rutland@xxxxxxx>

Mark.