Re: [tip:perfcounters/core] perf_counter: x86: Fix call-chain support to use NMI-safe methods

From: Mathieu Desnoyers
Date: Tue Jun 16 2009 - 17:14:33 EST

* H. Peter Anvin (hpa@xxxxxxxxx) wrote:
> Mathieu Desnoyers wrote:
> > * H. Peter Anvin (hpa@xxxxxxxxx) wrote:
> >> Mathieu Desnoyers wrote:
> >>> With respect to cr2, yes, this is the only window we care about.
> >>> However, the rest of vmalloc_fault() must be audited for other non
> >>> nmi-suitable data structure use (e.g. "current"), which I did in the
> >>> past.
> >>>
> >>> My intent was just to respond to Peter's concerns by showing that the
> >>> part of page fault handler which needs to be NMI-reentrant is really not
> >>> that big.
> >>>
> >> Even if that is true now, you want it to be true for all future time,
> >> and all to support an out-of-tree piece of code. All of this is
> >> virtually impossible to test for without said out-of-tree piece of code,
> >> so I will object to it anyway.
> >>
> >
> > I think we are confusing two very distinct topics here :
> >
> > LTTng is currently out-of-tree. Yes. It does not have to stay like this
> > in the future. Or it can be a different tracer, like ftrace for
> > instance.
> >
> > LTTng can be built as modules. This is very likely to stay like this
> > even if LTTng (or parts of) are merged. Or as a different tracer is
> > merged. The reason why building a tracer as a module is convenient for
> > users has been expressed in a previous mail.
> >
> > So now you argue that it should not be made easy to implement
> > tracers/profilers so they can be built as kernel modules because the
> > LTTng tracer is out-of-tree. I'm sorry, but I really don't follow your
> > line of reasoning.
> >
> > So let's say we merge tracer or profiler code into the mainline kernel
> > and permit that code to be built as module, hence enable testing within
> > the mainline tree, would you be fine with this?
> >
> I'm saying that imposing constraints on kernel code has cost. The cost
> may not be immediately evident, but it constrains kernel
> development going forward, potentially for all time. It is
> particularly obnoxious with out-of-tree users, because it is impossible
> to fix up those users to deal with a new interface, or even know what
> their constraints really are.
> This is part of the fundamental problem that Xen causes, for example.

Agreed. And I agree that mainlining such users is a big part of the
answer, because it makes the whole community aware of their
(ab)uses. However, in the specific case discussed here, I would much
rather add a reentrancy constraint on a very well-defined path of the
trap handler if that lets us simplify a bunch of in-kernel users. This
added encapsulation of architecture corner cases will eventually make
overall kernel development _simpler_, not harder.

Now if such encapsulation has an unbearable runtime cost, fine, we're
big boys and we can tweak the kernel code to deal with these corner
cases on a case-by-case basis. However, this kind of approach is
usually more error-prone.

So, in summary :

- near-zero measurable runtime cost.
- NMI-reentrancy constraint on a very small and well-defined trap
  handler code path.
- simplifies the life of tracers and profilers (meaning: makes a lot of
  _other_ kernel code much easier to write and understand).
- removes ad-hoc corner-case management from those users.
- provides early error detection, because the NMI-reentrant code path is
  shared by all users.
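
To make the cr2 window mentioned earlier in the thread concrete, here is
a minimal user-space sketch. It is not kernel code: every name in it
(sim_cr2, sim_raise_fault, sim_nmi_naive, sim_nmi_preserving,
sim_fault_with_nmi, SIM_NMI_FAULT_ADDR) is hypothetical, and the "CR2
register" is just a global variable. It only illustrates one possible
way to keep the interrupted fault handler's view of cr2 intact, namely
having the NMI path save cr2 on entry and restore it on exit.

```c
/* Hypothetical stand-in for the CPU's CR2 register (last faulting address). */
static unsigned long sim_cr2;

/* Address the simulated NMI handler faults on (assumption: any value works). */
#define SIM_NMI_FAULT_ADDR 0xdead0000UL

/* Simulated hardware action: a page fault latches the address into CR2. */
static void sim_raise_fault(unsigned long addr)
{
    sim_cr2 = addr;
}

/*
 * NMI handler that itself takes a fault (think: touching a page of a
 * vmalloc'ed module) and therefore overwrites CR2.
 */
static void sim_nmi_naive(void)
{
    sim_raise_fault(SIM_NMI_FAULT_ADDR);
}

/*
 * Same handler, but CR2 is saved on entry and restored on exit, so the
 * code that was interrupted still sees its own faulting address.
 */
static void sim_nmi_preserving(void)
{
    unsigned long saved = sim_cr2;

    sim_raise_fault(SIM_NMI_FAULT_ADDR);
    sim_cr2 = saved;
}

/*
 * Simulated fault path: the NMI lands in the window between the fault
 * and the handler's read of CR2. Returns the address the handler sees.
 */
static unsigned long sim_fault_with_nmi(unsigned long addr, void (*nmi)(void))
{
    sim_raise_fault(addr);  /* original fault latches its address */
    nmi();                  /* NMI fires before CR2 has been read */
    return sim_cr2;         /* what the fault handler observes */
}
```

Calling sim_fault_with_nmi(0x1000UL, sim_nmi_naive) returns the NMI's
faulting address instead of 0x1000, which is the corner case each
tracer or profiler would otherwise have to handle on its own; with
sim_nmi_preserving the handler reads back 0x1000.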

So I'll use your own argument : making this trap handler code path
nmi-reentrant will simplify an already existing bunch of in-kernel users
(oprofile, perf counter tool, ftrace, ...). Moving the burden from
subsystems spread across the kernel tree to a single, well-defined spot
looks like a constraint that will _diminish_ overall kernel development


> -hpa

Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68