Re: [PATCH v6 2/2] Output stall data in debugfs

From: Ingo Molnar
Date: Fri Aug 12 2011 - 05:07:25 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, 2011-08-10 at 11:02 -0700, Alex Neronskiy wrote:
> > @@ -210,22 +236,27 @@ void touch_softlockup_watchdog_sync(void)
> >  /* watchdog detector functions */
> >  static void update_hardstall(unsigned long stall, int this_cpu)
> >  {
> >  	if (stall > hardstall_thresh && stall > worst_hardstall) {
> >  		unsigned long flags;
> > +		spin_lock_irqsave(&hardstall_write_lock, flags);
> > +		if (stall > worst_hardstall) {
> > +			int write_ind = hard_read_ind;
> > +			int locked = spin_trylock(&hardstall_locks[write_ind]);
> > +			/* cannot wait, so if there's contention,
> > +			 * switch buffers */
> > +			if (!locked)
> > +				write_ind = !write_ind;
> > +
> >  			worst_hardstall = stall;
> > +			hardstall_traces[write_ind].nr_entries = 0;
> > +			save_stack_trace(&hardstall_traces[write_ind]);
> >
> > +			/* tell readers to use the new buffer from now on */
> > +			hard_read_ind = write_ind;
> > +			if (locked)
> > +				spin_unlock(&hardstall_locks[write_ind]);
> > +		}
> > +		spin_unlock_irqrestore(&hardstall_write_lock, flags);
> >  	}
> >  }
>
> That must be the most convoluted locking I've seen in a while.. OMG!

Well, there are also conceptual problems at a higher level: the
concept of recording a worst-case (or best-case) latency is not
limited to the comparatively minor use case of soft-watchdog stalls.

We have numerous tracers in ftrace that output their own kinds of
min/max latencies, with associated stack trace signatures.

So the right approach would *not* be to add yet another
special-purpose debugfs variant for this, but to integrate this
capability into perf tracing. That way it would be useful for:

 - soft stalls
 - irq service latencies
 - irq disable latencies
 - preempt disable latencies
 - wakeup latencies
 - and much more: it could be used for just about any event that
   measures some sort of latency.
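
To illustrate why one mechanism can serve all of the cases listed
above, here is a minimal userspace sketch of a generic "worst latency
plus stack trace" record. It is not kernel code: struct worst_latency,
worst_latency_update() and MAX_TRACE_DEPTH are hypothetical names, and
the sketch deliberately has no locking or per-CPU handling.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TRACE_DEPTH 8	/* hypothetical cap on saved frames */

/* One record per latency source (soft stall, irq-off, wakeup, ...). */
struct worst_latency {
	const char *name;			/* label for the source */
	unsigned long worst;			/* largest latency seen so far */
	unsigned long trace[MAX_TRACE_DEPTH];	/* addresses saved with it */
	size_t nr_entries;			/* valid entries in trace[] */
};

/*
 * Store `latency` and its stack trace if it exceeds both the threshold
 * and the current worst case. Returns 1 if the record was updated and
 * 0 otherwise. Single-threaded by assumption.
 */
static int worst_latency_update(struct worst_latency *w,
				unsigned long latency, unsigned long thresh,
				const unsigned long *trace, size_t nr_entries)
{
	if (latency <= thresh || latency <= w->worst)
		return 0;

	/* Truncate traces that are deeper than the record can hold. */
	if (nr_entries > MAX_TRACE_DEPTH)
		nr_entries = MAX_TRACE_DEPTH;

	w->worst = latency;
	memcpy(w->trace, trace, nr_entries * sizeof(*trace));
	w->nr_entries = nr_entries;
	return 1;
}
```

Nothing in the record depends on where the latency came from, so the
same update path could back any of the sources in the list.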

To implement it I'd first suggest adding a TRACE_EVENT() for the
soft-watchdog latencies, and then looking at how a stack trace attached
to the worst-case latency could be emitted via the perf ring buffer.

We do something very, very similar for callchains already, so all the
low level machinery is already there.
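
As a rough sketch of the first step, a tracepoint header along these
lines could carry the stall value. The event name watchdog_stall, its
two fields and the header location (include/trace/events/watchdog.h)
are assumptions made for illustration; none of them come from the
patch. This is a kernel header fragment, so it only builds inside a
kernel tree.

```c
/* Hypothetical include/trace/events/watchdog.h */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM watchdog

#if !defined(_TRACE_WATCHDOG_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_WATCHDOG_H

#include <linux/tracepoint.h>

/* Assumed event: one record per stall that exceeds the threshold. */
TRACE_EVENT(watchdog_stall,

	TP_PROTO(int cpu, unsigned long stall),

	TP_ARGS(cpu, stall),

	TP_STRUCT__entry(
		__field(int,		cpu)
		__field(unsigned long,	stall)
	),

	TP_fast_assign(
		__entry->cpu	= cpu;
		__entry->stall	= stall;
	),

	TP_printk("cpu=%d stall=%lu", __entry->cpu, __entry->stall)
);

#endif /* _TRACE_WATCHDOG_H */

/* This part must be outside the include guard. */
#include <trace/define_trace.h>
```

With such a header, one C file would define CREATE_TRACE_POINTS before
including it, and the detector would call
trace_watchdog_stall(this_cpu, stall) where it currently updates the
worst-case value. How the stack trace gets attached is the second step
and is not shown here.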

Alex, would you be interested in taking a stab at this approach? Such
an approach looks a *lot* more palatable from an upstream merge point
of view and it would give you all the functionality that the current
patches are providing you (and more).

Thanks,

Ingo