Re: [patch 1/2] x86_64 page fault NMI-safe

From: Frederic Weisbecker
Date: Wed Jul 14 2010 - 18:49:05 EST


On Wed, Jul 14, 2010 at 06:31:07PM -0400, Mathieu Desnoyers wrote:
> * Frederic Weisbecker (fweisbec@xxxxxxxxx) wrote:
> > On Wed, Jul 14, 2010 at 12:54:19PM -0700, Linus Torvalds wrote:
> > > On Wed, Jul 14, 2010 at 12:36 PM, Frederic Weisbecker
> > > <fweisbec@xxxxxxxxx> wrote:
> > > >
> > > > There is also the fact we need to handle the lost NMI, by deferring its
> > > > treatment or so. That adds even more complexity.
> > >
> > > I don't think you read my proposal very deeply. It already handles
> > > them by taking a fault on the iret of the first one (that's why we
> > > point to the stack frame - so that we can corrupt it and force a
> > > fault).
> >
> >
> > Ah right, I missed this part.
>
> Hrm, Frederic, I hate to ask that but.. what are you doing with those percpu 8k
> data structures exactly ? :)
>
> Mathieu



So, when an event triggers in perf, we sometimes want to capture the stacktrace
that led to the event.

We want this stacktrace (here we call it a callchain) to be recorded
locklessly. So we want one callchain buffer per cpu, with the following
type:

#define PERF_MAX_STACK_DEPTH 255

struct perf_callchain_entry {
	__u64	nr;
	__u64	ip[PERF_MAX_STACK_DEPTH];
};


That makes 2048 bytes (8 bytes for nr plus 255 * 8 bytes for ip). But one
buffer per cpu is not enough for the callchain to be recorded locklessly. We
also need one buffer per context: task, softirq, hardirq, nmi, as an event can
trigger in any of these.
Since we disable preemption, none of these contexts can nest locally. In
fact hardirqs can nest, but we just don't care about this corner case.
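
To illustrate the per cpu, per context layout described above, here is a
minimal userspace sketch. It is not the actual perf code: NR_CPUS_SKETCH,
enum callchain_ctx, callchain_bufs and get_callchain_buf() are hypothetical
names made up for this example, uint64_t stands in for __u64, and the caller
is assumed to already know its cpu and context.

```c
#include <stddef.h>
#include <stdint.h>

#define PERF_MAX_STACK_DEPTH 255
#define NR_CPUS_SKETCH 4 /* hypothetical cpu count, for illustration only */

/* One slot per context an event can trigger in (hypothetical enum). */
enum callchain_ctx { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_NR };

struct perf_callchain_entry {
	uint64_t nr;
	uint64_t ip[PERF_MAX_STACK_DEPTH];
};

/* Stand-in for a per cpu variable: one buffer per (cpu, context) pair. */
static struct perf_callchain_entry callchain_bufs[NR_CPUS_SKETCH][CTX_NR];

/*
 * Return the buffer owned by this cpu and context, reset to empty.
 * No lock is taken: the sketch relies on the assumption that a given
 * context does not nest on the same cpu, so each slot has one writer.
 */
struct perf_callchain_entry *get_callchain_buf(unsigned int cpu,
					       enum callchain_ctx ctx)
{
	struct perf_callchain_entry *entry;

	if (cpu >= NR_CPUS_SKETCH || (unsigned int)ctx >= CTX_NR)
		return NULL;

	entry = &callchain_bufs[cpu][ctx];
	entry->nr = 0;
	return entry;
}
```

An NMI that interrupts a task-context recording on the same cpu gets the
CTX_NMI slot, so it never touches the half-filled CTX_TASK buffer.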

So that makes 2048 * 4 = 8192 bytes, and that is per cpu.
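
The arithmetic can be checked at compile time with a small userspace sketch.
NR_CONTEXTS and callchain_bytes_per_cpu() are hypothetical names for this
example, and uint64_t is assumed to be equivalent to __u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PERF_MAX_STACK_DEPTH 255
#define NR_CONTEXTS 4 /* task, softirq, hardirq, nmi */

struct perf_callchain_entry {
	uint64_t nr;
	uint64_t ip[PERF_MAX_STACK_DEPTH];
};

/* 8 bytes for nr + 255 * 8 bytes for ip = 256 * 8 = 2048 bytes. */
static_assert(sizeof(struct perf_callchain_entry) == 2048,
	      "one callchain entry should be 2048 bytes");

/* Memory needed on each cpu: one entry per context. */
size_t callchain_bytes_per_cpu(void)
{
	return NR_CONTEXTS * sizeof(struct perf_callchain_entry);
}
```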
