Re: [PATCH 2/2] tracing/events/lockdep: move tracepoints within recursive protection

From: Mathieu Desnoyers
Date: Thu Apr 16 2009 - 14:22:49 EST


* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
>
>
> [ added Mathieu, since he likes things like this ]
>
> On Thu, 16 Apr 2009, Peter Zijlstra wrote:
>
> > On Thu, 2009-04-16 at 13:38 -0400, Steven Rostedt wrote:
> >
> > > > > Note, that the ring buffer and events are made to be recursive. That is,
> > > > > it allows one event to trace within another event.
> > > >
> > > > But surely not in the same context. You could do a 4 level recursion
> > > > protection like I did in perf-counter, not allowing recursion in:
> > > >
> > > > nmi, irq, softirq, process - context.
> > >
> > > Why not allow a nested interrupt to trace?
> > >
> > > I don't want to add this logic to the lower levels, where only a few
> > > users need the protection. The protecting should be at the user level.
> >
> > wouldn't you want to disable preemption/softirq/irqs in the tracer -- to
> > avoid such recursion to begin with (preemption isn't even strictly
> > needed if you put the recursion count in the task struct, as each task
> > has a new stack anyway).
>
> No, we only disable preemption, nothing more. Interrupts and softirqs are
> free to happen. Also, we allow tracing of NMIs.
>
> >
> > I think having a recursion detection in place is far more valuable than
> > being able to recursively trace interrupts and the like, which are
> > exceedingly rare (on x86, and power and other arch with multiple
> > interrupt levels that each have their own stack can extend the recursion
> > levels too).
>
> Is there any arch generic way to tell what level you are at?
>
> That is, at thread context, you are at level 0, if an interrupt comes
> in, it sets you to level 1, if another interrupt comes in, it sets you to
> level 2, and so on.
>
> I guess we could add this into the irq_enter/exit softirq_enter/exit and
> nmi_enter/exit.
>
> Thus we can have each task with a bitmask. When we start to trace, we set
> the bit corresponding to the level the task is at.
>
> Ie. in thread context, we set bit 0, if we are interrupted by a
> softirq/irq/nmi, we set the level bit we are at. Hmm, we might be able to
> do this via the preempt count already :-/
>
> Just add the softirq/irq/nmi bits together.
>
> Then if the bit is already set we can dump out a warning.
>
> I'll try that out.
>

I think having a "recursive tracer call" detection is very valuable.
e.g., if the trace code has a bug and causes a page fault, which happens
to be instrumented and to call the tracer again, I would call this a
recursive tracing call. In that case we want to drop the nested event.

However, if the tracer code is interrupted, and the nested tracer code
is simply "nested" without any recursion with the context underneath
(and therefore will never cause infinite recursion), then everything
should go smoothly, no need to drop anything.

If the trap handlers do not change the preempt count, then I think your
idea should work. Also don't forget the "in nmi" preempt count bit.
Given that all you want to check is if the preempt count for softirq,
irq and nmis have changed or not compared to the snapshot taken in the
context underneath, I think you just have to remember those bits, no
need to "add" them together. A simple copy, and a bitmask "and" to keep
only nmi, irq, softirq bits in the test should be fine.
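
A rough sketch of the snapshot-and-compare check described above, written
as a userspace model rather than kernel code. Everything in it is an
assumption made for illustration: the SKETCH_* mask values, struct
trace_guard, and the trace_enter()/trace_exit() helpers are hypothetical
names, and the real preempt count bit layout lives in the kernel headers
and may differ. It also assumes, as stated above, that trap handlers do
not change the preempt count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout, loosely modeled on a preempt count.
 * The low byte stands in for the preempt-disable depth and is ignored. */
#define SKETCH_SOFTIRQ_MASK 0x0000ff00u
#define SKETCH_HARDIRQ_MASK 0x00ff0000u
#define SKETCH_NMI_MASK     0x04000000u
#define SKETCH_CTX_MASK \
	(SKETCH_SOFTIRQ_MASK | SKETCH_HARDIRQ_MASK | SKETCH_NMI_MASK)

/* Per-CPU (or per-task) tracer state in this model. */
struct trace_guard {
	bool active;        /* a tracer call is in progress underneath */
	unsigned int ctx;   /* nmi/irq/softirq bits snapshotted at entry */
};

/*
 * Returns false if the event must be dropped: the tracer is already
 * running and the nmi/irq/softirq bits are unchanged, so this call is
 * recursion within the same context (e.g. an instrumented page fault
 * taken by the tracer itself). A call from a nested interrupt sees
 * different bits and is allowed. The previous state is saved in the
 * caller's stack frame so that trace_exit() can restore it.
 */
static bool trace_enter(struct trace_guard *g, unsigned int preempt_count,
			struct trace_guard *saved)
{
	/* Simple copy plus a bitmask "and"; no need to add the bits. */
	unsigned int ctx = preempt_count & SKETCH_CTX_MASK;

	if (g->active && g->ctx == ctx)
		return false;

	*saved = *g;
	g->active = true;
	g->ctx = ctx;
	return true;
}

/* Must be paired with a trace_enter() call that returned true. */
static void trace_exit(struct trace_guard *g, const struct trace_guard *saved)
{
	assert(g->active);
	*g = *saved;
}
```

With this model, a tracer call in process context followed by another one
with the same masked bits is dropped, while a call made with a hardirq bit
set while the process-context call is still active goes through, and the
process-context snapshot is back in place once the nested call exits.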

Mathieu

>
> >
> > > > That allows you to trace an irq while you're tracing something in
> > > > process context, etc.. But not allow recursion on the same level.
> > > >
> > > > > If the tracepoint is
> > > > > triggered by something within the trace point handler, then we are
> > > > > screwed. That needs to be fixed.
> > > >
> > > > Exactly the thing you want to detect and warn about, preferably with a
> > > > nice stack trace.
> > >
> > > It's hard when you want to allow nesting.
> >
> > Hard never stopped us before, did it ;-)
>
> And it may not be that hard if we do the above.
>
> -- Steve
>
> >
> > > > > I have not seen what is triggering back into locking. The ring buffer and
> > > > > what I can see by the event code, does not grab any locks besides raw
> > > > > ones.
> > > >
> > > > Well, it used to all work, so something snuck in.
> > >
> > > Note, it seems only the lockdep has issues with nesting. Perhaps when I
> > > can publish the lockless ring buffer this will all go away?
> >
> > I doubt it, it shouldn't happen as it stands -- so this patch only hides
> > the real issue.
> >
> >

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68