Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
From: Peter Zijlstra
Date: Tue May 31 2011 - 18:29:19 EST
On Tue, 2011-05-31 at 09:49 -0400, Vince Weaver wrote:
> On Tue, 31 May 2011, Peter Zijlstra wrote:
>
> > On Mon, 2011-05-30 at 21:33 -0400, Vince Weaver wrote:
> > > the problem was the mentioned commit tried to optimize the use of
> > > watermark and wakeup_watermark without taking into account that
> > > wakeup_watermark is a union with wakeup_events.
> >
> > Note that wakeup_events isn't related to IOC_REFRESH; wakeup_events is how
> > many events to buffer in the mmap-buffer before issuing a wakeup.
> >
> > IOC_REFRESH increments event_limit, which is how many events to run
> > before disabling yourself.
> >
> > What I gather is that due to that SIGIO bug (fixed by f506b3dc0e), you
> > had to have both an mmap and a wakeup in order for that signal to
> > arrive.
>
> yes, but due to a bug in the mentioned changeset, the buffer watermark
> value was being set to a low value even if *watermark* was 0. So if you
> were using IOC_REFRESH to set the *wakeup_events* value,
IOC_REFRESH sets event->event_limit, not wakeup_events.
> it was also
> setting the *wakeup_watermark* value (it's a union) and the buffer setup
> was then unconditionally setting the buffer watermark to the value of the
> supposedly unrelated *wakeup_watermark*. Normally the wakeup watermark
> would default to something like 2048, but if you were trying to set the
> wakeup_events value to something like 3 then wakeup_watermark would be set
> to that too, causing a lot more overflow events.
poll() wakeups, which were inadvertently linked to SIGIOs
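
For anyone following along, here is a minimal userspace sketch of the
aliasing described above. It is not the kernel code: struct demo_attr and
both helper functions are hypothetical names that only mimic the
watermark flag and the wakeup_events/wakeup_watermark union, and the
fallback argument stands in for whatever default the buffer setup would
otherwise pick (Vince mentions something like 2048).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of the perf attr. */
struct demo_attr {
    unsigned int watermark;        /* 0: count events, 1: count bytes */
    union {
        uint32_t wakeup_events;    /* wake up every n events */
        uint32_t wakeup_watermark; /* bytes before wakeup */
    };
};

/* Both names refer to the same storage, so writing one changes the other. */
static_assert(offsetof(struct demo_attr, wakeup_events) ==
              offsetof(struct demo_attr, wakeup_watermark),
              "wakeup_events and wakeup_watermark must alias");

/*
 * Model of the behaviour reported in this thread: the union is read as a
 * byte watermark without looking at the watermark flag.
 */
static uint32_t buffer_watermark_unchecked(const struct demo_attr *attr,
                                           uint32_t fallback)
{
    return attr->wakeup_watermark ? attr->wakeup_watermark : fallback;
}

/* Model of the intended behaviour: only honour it when the flag is set. */
static uint32_t buffer_watermark_checked(const struct demo_attr *attr,
                                         uint32_t fallback)
{
    if (attr->watermark && attr->wakeup_watermark)
        return attr->wakeup_watermark;
    return fallback;
}
```

With watermark left at 0 and wakeup_events set to 3, the unchecked helper
returns 3 while the checked one returns the fallback, which is the
difference between a wakeup every few bytes and one every couple of KB.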
> I verified all the above painfully using a lot of printks.
I prefer to use trace_printk() and /debug/tracing/; that doesn't slow
stuff down as much.
> I agree this does seem to be a combination of bugs, as even with a
> properly set value on affected kernels you'd get spurious watermark
> overflow events if you weren't consuming the ring buffer.
*nod*
> In any case, I can provide a cleaner patch than the one before that isn't
> as intrusive.
Appreciated.
> I'm also bisecting the other problem I mentioned, the one where overflows
> are 10x too large on 3.0-rc1. I'm at work with a Nehalem machine so the
> bisect should go faster than the bisect I had to do on an atom machine
> this weekend.
It wouldn't be the SIGIO fix, would it? With that, every overflow
generates a SIGIO, not only the poll() wakeups. And ouch at bisecting
(or even building a kernel) on an Atom, those things are horridly slow.
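
To put rough numbers on that difference, here is a toy model, not kernel
code. It assumes a wakeup is issued once per wakeup_events overflows and
compares that with one signal per overflow; struct signal_counts and
count_signals() are made-up names, and a wakeup_events of 0 is not
modelled at all.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result pair: signals seen under each behaviour. */
struct signal_counts {
    uint64_t per_wakeup;   /* SIGIO tied to poll() wakeups only */
    uint64_t per_overflow; /* SIGIO on every overflow */
};

/*
 * Simplified assumption: one wakeup for every wakeup_events overflows,
 * with any remainder not yet having triggered a wakeup.
 */
static struct signal_counts count_signals(uint64_t overflows,
                                          uint32_t wakeup_events)
{
    struct signal_counts c;

    assert(wakeup_events > 0); /* 0 is outside this model */
    c.per_wakeup = overflows / wakeup_events;
    c.per_overflow = overflows;
    return c;
}
```

For example, count_signals(1000, 4) gives 250 signals in the wakeup-tied
case and 1000 in the per-overflow case, so a test that counts signals
would see the count jump by a factor of wakeup_events.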
> A power outage over the weekend has taken part of the
> network down here though so my e-mail access is a bit limited, so I
> apologize if I've been missing comments sent to my other e-mail address.
I'm afraid not, I've been mostly tied up with fixing some scheduler
regressions.
Also, it looks like I just broke stuff even worse in -tip, am bisecting
that now.