Re: [PATCH] perf/x86: Fix overlap counter scheduling bug
From: Peter Zijlstra
Date: Tue Nov 08 2016 - 11:57:55 EST
On Tue, Nov 08, 2016 at 04:22:13PM +0000, Liang, Kan wrote:
>
>
> > >
> > >
> > > diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
> > > index 272427700d48..71bc348736bd 100644
> > > --- a/arch/x86/events/intel/uncore_snbep.c
> > > +++ b/arch/x86/events/intel/uncore_snbep.c
> > > @@ -669,7 +669,7 @@ static struct event_constraint snbep_uncore_cbox_constraints[] = {
> > > UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
> > > UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
> > > UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
> > > - EVENT_CONSTRAINT_OVERLAP(0x1f, 0xe, 0xff),
> > > + UNCORE_EVENT_CONSTRAINT(0x1f, 0xc); /* should be 0x0e but that gives scheduling pain */
>
> I think the crash is caused by the overlap bit.
> Why not just revert the previous patch?
>
> If overlap bit is removed, the perf_sched_save_state will never be touched.
> Why we have to reduce a counter?

By simply removing the overlap bit you'll still get bad scheduling
(we'll just not crash).

I think all the 0x3 masks need the overlap flag set, since they clearly
overlap with the 0x1 masks. That would improve the scheduling.

But as Jiri noted, you cannot do 0x1 + 0x3 + 0xc + 0xe without also
raising the retry limit, because those are 4 overlapping masks and,
worst case, you'll have to pop 3 attempts.

By reducing 0xe to 0xc you'll not have 4 overlapping masks anymore.

In any case, overlapping masks stink (because they make scheduling
O(n!)) and ideally hardware would not do this.
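
To make the cost of overlapping masks concrete, here is a minimal
user-space sketch. It is not the kernel's perf_sched code. All names
(mask_weight, schedule_greedy, schedule_backtrack, ...) and the masks
0x09 and 0x07 used below are made up for illustration. It assumes events
are placed in order of increasing mask weight and that each event tries
the lowest free counter first.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS 8	/* hypothetical counter count */
#define MAX_EVENTS   8	/* hypothetical event limit */

/* Number of counters a constraint mask allows (its "weight"). */
static int mask_weight(uint32_t mask)
{
	int w = 0;

	for (; mask; mask &= mask - 1)
		w++;
	return w;
}

/* Lowest counter >= from that is set in mask, or -1 if there is none. */
static int first_counter(uint32_t mask, int from)
{
	for (int c = from; c < NUM_COUNTERS; c++)
		if (mask & (1u << c))
			return c;
	return -1;
}

/* Stable insertion sort of event indices, most constrained first. */
static void order_by_weight(const uint32_t *masks, int n, int *order)
{
	for (int i = 0; i < n; i++) {
		int j = i;

		while (j > 0 &&
		       mask_weight(masks[order[j - 1]]) > mask_weight(masks[i])) {
			order[j] = order[j - 1];
			j--;
		}
		order[j] = i;
	}
}

/* Greedy: take the lowest free counter and never revisit the choice. */
static int schedule_greedy(const uint32_t *masks, int n, int *assign)
{
	int order[MAX_EVENTS];
	uint32_t used = 0;

	assert(n <= MAX_EVENTS);
	order_by_weight(masks, n, order);
	for (int i = 0; i < n; i++) {
		int c = first_counter(masks[order[i]] & ~used, 0);

		if (c < 0)
			return -1;
		assign[order[i]] = c;
		used |= 1u << c;
	}
	return 0;
}

/* Try every allowed counter for event i; count each undone choice. */
static int place(const uint32_t *masks, const int *order, int n, int i,
		 uint32_t used, int *assign, int *retries)
{
	if (i == n)
		return 0;

	uint32_t avail = masks[order[i]] & ~used;
	int tried = 0;

	for (int c = first_counter(avail, 0); c >= 0;
	     c = first_counter(avail, c + 1)) {
		if (tried++)
			(*retries)++;	/* previous pick for this event failed */
		assign[order[i]] = c;
		if (!place(masks, order, n, i + 1, used | (1u << c),
			   assign, retries))
			return 0;
	}
	return -1;
}

/* Backtracking: same ordering as greedy, but earlier picks can be undone. */
static int schedule_backtrack(const uint32_t *masks, int n, int *assign,
			      int *retries)
{
	int order[MAX_EVENTS];

	assert(n <= MAX_EVENTS);
	order_by_weight(masks, n, order);
	*retries = 0;
	return place(masks, order, n, 0, 0, assign, retries);
}
```

With one event constrained to 0x09 and three constrained to 0x07, the
greedy pass puts the 0x09 event on counter 0 and then cannot fit the
third 0x07 event. The backtracking pass has to undo that first pick and
move the 0x09 event to counter 3 before everything fits. Every such
undo is a retry, and in the worst case the search has to walk through
the possible orderings, which is where the O(n!) comes from.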