Re: [PATCH V8 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE

From: Stephane Eranian
Date: Wed Sep 30 2020 - 18:45:28 EST


On Wed, Sep 30, 2020 at 10:30 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Sep 30, 2020 at 07:48:48AM -0700, Dave Hansen wrote:
> > On 9/30/20 7:42 AM, Liang, Kan wrote:
> > >> When I tested on my kernel, it panicked because I suspect
> > >> current->active_mm could be NULL. Adding a check for NULL avoided the
> > >> problem. But I suspect this is not the correct solution.
> > >
> > > I guess the NULL active_mm should be a rare case. If so, I think it's
> > > not bad to add a check and return 0 page size.
> >
> > I think it would be best to understand why ->active_mm is NULL instead
> > of just papering over the problem. If it is papered over, and this is
> > common, you might end up effectively turning off your shiny new feature
> > inadvertently.
>
> context_switch() can set prev->active_mm to NULL when it transfers it to
> @next. It does this before @current is updated. So an NMI that comes in
> between this active_mm swizzling and updating @current will see
> !active_mm.
>
I think Peter is right. This code is called in the context of NMI, so
if active_mm is set to NULL inside
a locked section, this is not enough to protect the perf_events code
from seeing it.

> In general though; I think using ->active_mm is a mistake though. That
> code should be doing something like:
>
>
> mm = current->mm;
> if (!mm)
> mm = &init_mm;
>
>