Re: [PATCH] kfence: Use pt_regs to generate stack trace on faults
From: Marco Elver
Date: Thu Nov 05 2020 - 06:11:33 EST
On Thu, 5 Nov 2020 at 11:52, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> On Thu, Nov 05, 2020 at 10:21:33AM +0100, Marco Elver wrote:
> > Instead of removing the fault handling portion of the stack trace based
> > on the fault handler's name, just use struct pt_regs directly.
> >
> > Change kfence_handle_page_fault() to take a struct pt_regs, and plumb it
> > through to kfence_report_error() for out-of-bounds, use-after-free, or
> > invalid access errors, where pt_regs is used to generate the stack
> > trace.
> >
> > If the kernel is a DEBUG_KERNEL, also show registers for more
> > information.
> >
> > Suggested-by: Mark Rutland <mark.rutland@xxxxxxx>
> > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
>
> Wow; I wasn't expecting this to be put together so quickly, thanks for
> doing this!
>
> From a scan, this looks good to me -- just one question below.
>
> > diff --git a/include/linux/kfence.h b/include/linux/kfence.h
> > index ed2d48acdafe..98a97f9d43cd 100644
> > --- a/include/linux/kfence.h
> > +++ b/include/linux/kfence.h
> > @@ -171,6 +171,7 @@ static __always_inline __must_check bool kfence_free(void *addr)
> > /**
> > * kfence_handle_page_fault() - perform page fault handling for KFENCE pages
> > * @addr: faulting address
> > + * @regs: current struct pt_regs (can be NULL, but shows full stack trace)
> > *
> > * Return:
> > * * false - address outside KFENCE pool,
>
> > @@ -44,8 +44,12 @@ static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries
> > case KFENCE_ERROR_UAF:
> > case KFENCE_ERROR_OOB:
> > case KFENCE_ERROR_INVALID:
> > - is_access_fault = true;
> > - break;
> > + /*
> > + * kfence_handle_page_fault() may be called with pt_regs
> > + * set to NULL; in that case we'll simply show the full
> > + * stack trace.
> > + */
> > + return 0;
>
> For both the above comments, when/where is kfence_handle_page_fault()
> called with regs set to NULL? I couldn't spot that in this patch, so
> unless I mised it I'm guessing that's somewhere outside of the patch
> context?
Right, currently it's not expected to happen, but I'd like to permit
this function being called not from fault handlers, for use-cases like
this:
https://lkml.kernel.org/r/CANpmjNNxAvembOetv15FfZ=04mpj0Qwx+1tnn22tABaHHRRv=Q@xxxxxxxxxxxxxx
The revised recommendation when trying to get KFENCE to give us more
information about allocation/free stacks after refcount underflow
(like what Paul was trying to do) would be to call
kfence_handle_page_fault(addr, NULL).
> If this is a case we don't expect to happen, maybe add a WARN_ON_ONCE()?
While it's currently not expected, I don't see why we should make this
WARN and limit the potential uses of the API if it works just fine if
we pass regs set to NULL. Although arguably the name
kfence_handle_page_fault() might be confusing for such uses, for now,
until more widespread use is evident (if at all) I'd say let's keep
as-is, but simply not prevent such use-cases.
Thanks,
-- Marco