Re: [patch V6 04/37] x86: Make hardware latency tracing explicit

From: Thomas Gleixner
Date: Sun May 17 2020 - 04:49:18 EST


Andy Lutomirski <luto@xxxxxxxxxx> writes:
> On Fri, May 15, 2020 at 5:10 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>
>>
>> The hardware latency tracer calls into trace_sched_clock and ends up in
>> various instrumentable functions which is problemeatic vs. the kprobe
>> handling especially the text poke machinery. It's invoked from
>> nmi_enter/exit(), i.e. non-instrumentable code.
>>
>> Use nmi_enter/exit_notrace() instead. These variants do not invoke the
>> hardware latency tracer which avoids chasing down complex callchains to
>> make them non-instrumentable.
>>
>> The real interesting measurement is the actual NMI handler. Add an explicit
>> invocation for the hardware latency tracer to it.
>>
>> #DB and #BP are uninteresting as they really should not be in use when
>> analzying hardware induced latencies.
>>
>
>> @@ -849,7 +851,7 @@ static void noinstr handle_debug(struct
>> static __always_inline void exc_debug_kernel(struct pt_regs *regs,
>> unsigned long dr6)
>> {
>> - nmi_enter();
>> + nmi_enter_notrace();
>
> Why can't exc_debug_kernel() handle instrumentation? We shouldn't
> recurse into #DB since we've already cleared DR7, right?

It can later on. The point is that the trace stuff calls into the world
and some more before the entry handling is complete.

Remember this is about ensuring that all the state is properly
established before any of this instrumentation muck can happen.

DR7 handling is specific to #DB and done even before nmi_enter to
prevent recursion.

Thanks,

tglx