Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)

From: Ingo Molnar
Date: Thu Jul 13 2017 - 05:19:26 EST

Next message: Viresh Kumar: "Re: [PATCH] cpufreq: dt: Add zynqmp to the cpufreq dt platdev"
Previous message: Michal Simek: "[PATCH 2/2] pinctrl: zynq: Fix warnings in the driver"
In reply to: Peter Zijlstra: "Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)"
Next in thread: Josh Poimboeuf: "Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > One gloriously ugly hack would be to delay the userspace unwind to
> > return-to-userspace, at which point we have a schedulable context and can take
> > faults.

I don't think it's ugly at all.

> > Of course, then you have to somehow identify this later unwind sample with all
> > relevant prior samples and stitch the whole thing back together, but that
> > should be doable.
> >
> > In fact, it would not be at all hard to do, just queue a task_work from the
> > NMI and have that do the EH based unwind.

This would have several advantages:

 - as you mention, being able to fault in debug info and generally do
   IO/scheduling,

 - profiling overhead would be accounted to the task context that generates it,
   not the NMI context,

 - there would be a natural batching/coalescing optimization if multiple events
   hit the same system call: the user-space backtrace would only have to be
   looked up once for all samples that got collected.
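
To make the coalescing concrete, here is a rough user-space model of the idea,
not kernel code. Everything in it is made up for illustration: the Task class,
the unwind_pending flag standing in for a queued task_work, and the function
names in the example backtraces. It only shows that three samples taken during
one trip through the kernel lead to a single user-space unwind.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical per-task state; not a real kernel structure."""
    unwind_pending: bool = False          # stands in for a queued task_work
    ring_buffer: list = field(default_factory=list)
    user_unwinds: int = 0                 # how many user-space unwinds ran


def on_nmi_sample(task, kernel_backtrace):
    """NMI context: record the kernel side only, defer the user side."""
    task.ring_buffer.append(("kernel", kernel_backtrace))
    # Setting the flag again is harmless, so repeated samples coalesce.
    task.unwind_pending = True


def on_return_to_user(task, unwind_user_stack):
    """Schedulable context: the unwind may fault or do IO here."""
    if not task.unwind_pending:
        return
    task.unwind_pending = False
    task.user_unwinds += 1
    task.ring_buffer.append(("user", unwind_user_stack()))


# Three samples hit the same (illustrative) system call.
task = Task()
for kernel_bt in (["sys_read", "vfs_read"],
                  ["sys_read", "ext4_file_read_iter"],
                  ["sys_read", "copy_to_user"]):
    on_nmi_sample(task, kernel_bt)

# One user-space unwind at syscall return covers all of them.
on_return_to_user(task, lambda: ["main", "read_config", "read"])
```

In this model the unwind cost is paid in on_return_to_user(), i.e. in the task's
own context, and a second call without new samples does nothing.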

This could be done by splitting the user-space backtrace out into its own event,
and perf tooling would then apply that user-space backtrace to all of the kernel
samples that precede it.

I.e. the ring-buffer would have trace entries like:

[ kernel sample #1, with kernel backtrace #1 ]
[ kernel sample #2, with kernel backtrace #2 ]
[ kernel sample #3, with kernel backtrace #3 ]
[ user-space backtrace #1 at syscall return ]
...

Note how the three kernel samples didn't have to do any user-space unwinding at
all, so the user-space unwinding overhead got reduced by a factor of 3.

Tooling would know that 'user-space backtrace #1' applies to the previous three
kernel samples.
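
On the tooling side the stitching could look roughly like the sketch below. This
is an assumption about how a consumer might do it, not how perf does it: the
("kernel" | "user", backtrace) record format and the stitch() helper are
hypothetical, and the records are assumed to come from a single task in
ring-buffer order.

```python
def stitch(records):
    """Attach each user-space backtrace to the kernel samples before it.

    records: iterable of (kind, backtrace) tuples from one task, in
    ring-buffer order, where kind is "kernel" or "user" (assumed format).
    Returns (complete, pending): complete is a list of
    (kernel_backtrace, user_backtrace) pairs, pending holds kernel
    backtraces that have not seen a user-space backtrace yet.
    """
    complete = []
    pending = []
    for kind, backtrace in records:
        if kind == "kernel":
            pending.append(backtrace)
        elif kind == "user":
            # One user-space backtrace serves every waiting kernel sample.
            for kernel_bt in pending:
                complete.append((kernel_bt, backtrace))
            pending = []
    return complete, pending


records = [
    ("kernel", ["kernel backtrace #1"]),
    ("kernel", ["kernel backtrace #2"]),
    ("kernel", ["kernel backtrace #3"]),
    ("user", ["user-space backtrace #1"]),
    ("kernel", ["kernel backtrace #4"]),   # no user-space backtrace yet
]
complete, leftover = stitch(records)
```

Samples left in the pending list are the ones whose task has not returned to
user-space yet; what to do with them at the end of a trace is a policy decision
this sketch leaves open.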

Or so?

Thanks,

Ingo
