Re: [PATCH] perf/x86: Consider pinned events for group validation
From: Peter Zijlstra
Date: Tue Aug 20 2019 - 11:09:55 EST
On Tue, Aug 20, 2019 at 10:52:57AM -0400, Liang, Kan wrote:
> On 8/20/2019 10:10 AM, Peter Zijlstra wrote:
> > On Fri, Aug 16, 2019 at 10:49:10AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
> > > From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
> > >
> > > perf stat -M metrics relies on weak groups to reject unschedulable
> > > groups and run them as non-groups.
> > > This uses the group validation code in the kernel. Unfortunately
> > > that code doesn't take pinned events, such as the NMI watchdog, into
> > > account. So some groups can pass validation, but then later still
> > > never schedule.
> > But if you first create the group and then a pinned event it 'works',
> > which is inconsistent and makes all this timing dependent.
> I don't think so. The pinned event will be validated by validate_event(),
> which doesn't simulate the schedule.
> So the validation still pass, but the group still never schedule.
Exactly my point; so sometimes it fails creation and sometimes if fails
running. So now we have two failiure cases instead of one and the
reason might not always be evident.
> > > + /*
> > > + * The new group must can be scheduled
> > > + * together with current pinned events.
> > > + * Otherwise, it will never get a chance
> > > + * to be scheduled later.
> > That's wrapped short; also I don't think it is sufficient; what if you
> > happen to have a pinned event on CPU1 (and not others) and happen to run
> > validation for a new CPU1 event on CPUn ?
> The patch doesn't support this case.
Which makes the whole thing even more random.
> It is mentioned in the description.
> The patch doesn't intend to catch all possible cases that cannot be
> scheduled. I think it's impossible to catch all cases.
> We only want to improve the validate_group() a little bit to catch some
> common cases, e.g. NMI watchdog interacting with group.
> > Also; per that same; it is broken, you're accessing the cpu-local cpuc
> > without serialization.
> Do you mean accessing all cpuc serially?
> We only check the cpuc on current CPU here. It doesn't intend to access
> other cpuc.
There's nothing preventing the cpuc you're looking at changing while
you're looking at it. Heck, afaict it is possible to UaF here. Nothing
prevents the events you're looking at from going away and getting freed.