Re: [PATCH] perf: Fix oops when kthread execs user process

From: Mark Rutland
Date: Tue May 28 2019 - 12:16:22 EST


On Tue, May 28, 2019 at 04:32:24PM +0100, Will Deacon wrote:
> On Tue, May 28, 2019 at 04:01:03PM +0200, Peter Zijlstra wrote:
> > On Tue, May 28, 2019 at 08:31:29PM +0800, Young Xiao wrote:
> > > When a kthread calls call_usermodehelper() the steps are:
> > > 1. allocate current->mm
> > > 2. load_elf_binary()
> > > 3. populate current->thread.regs
> > >
> > > While doing this, interrupts are not disabled. If there is a perf
> > > interrupt in the middle of this process (i.e. step 1 has completed
> > > but not yet reached to step 3) and if perf tries to read userspace
> > > regs, kernel oops.
>
> This seems to be because pt_regs(current) gives NULL for kthreads on Power.

I think you mean task_pt_regs(current) here.

> > > Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
> > > pt_regs are not set.
> > >
> > > See commit bf05fc25f268 ("powerpc/perf: Fix oops when kthread execs
> > > user process") for details.
> >
> > Why the hell do we set current->mm before it is complete? Note that
> > normally exec() builds the new mm before attaching it, see exec_mmap()
> > in flush_old_exec().
>
> From the initial report [1], it doesn't look like the mm isn't initialised,
> but rather than we're dereferencing a NULL pt_regs pointer somehow for the
> current task (see previous comment). I don't see how that can happen on
> arm64, given that we put the pt_regs on the kernel stack which is allocated
> during fork.
>
> Will
>
> [1] https://git.kernel.org/linus/bf05fc25f268

One caveat is that for the idle threads, the initial SP overlaps the
task_pt_regs() area:

* __primary_switched starts SP at init_thread_union + THREAD_SIZE.

* __cpu_up() starts SP at task_stack_page(idle) + THREAD_SIZE.

... and in either case, sampling that would be bad.

For both arm, I believe similar holds true. AFAICT x86 seems to reserve
the regs area in its head_{64,32}.S, but I can't see what it does for
other threads.

Regardless, for arm, arm64, and x86, task_pt_regs(current) cannot be
NULL.

Thanks,
Mark.