Re: [PATCH V2 2/2] trace,x86: Add nmi to the irq_vectors class

From: Daniel Bristot de Oliveira
Date: Mon Apr 08 2019 - 08:24:55 EST


On 4/5/19 11:50 PM, Thomas Gleixner wrote:
> On Mon, 1 Apr 2019, Daniel Bristot de Oliveira wrote:
>> Currently, the irq_vector class of tracepoints does not include the NMI
>> entry. The NMI was in the first set of tracepoints for IRQs, but it was
>> dropped off because of the logic of switching IDT when enabling trace[1].
>> However, as the switching IDT logic was removed [2], it is possible to add
>
> Well, not really. tracepoints inside of do_nmi() have been there for a long
> time.
>
> The nasty part was that the irq_vector patches injected the tracepoints way
> down in the low level entry code, which caused more problems than it solved
> including the extra IDT switcheroo.

humm... the benefits of having the tracepoints at the very begin/end of the
handler is that it captures better the amount of interference/overhead it caused
in the current thread (and IRQ too in the case of NMIs).

>> the NMI tracepoints back.
>
> So again we have two tracepoints. One for the entry and one for the
> handlers. Do we really need both? If so please provide some rationale.

For example:

f-4447 [000] d.Z. 1253.162036: nmi_entry: vector=2
f-4447 [000] d.Z. 1253.162038: nmi_handler: nmi_cpu_backtrace_handler()
delta_ns: 354 handled: 0
f-4447 [000] d.Z. 1253.162267: nmi_exit: vector=2

The nmi_handler tracepoint points to an interference of 354 ns, what is very
good! But the difference of time between the two irq_vectors tracepoint is:

1253.162267-1253.162036 = 231 us!

The number looks really odd, but tracing with function tracer, we see this:

Thread running:
f-4447 [000] d.Z. 1487.689288: rcu_nmi_enter <-do_nmi
f-4447 [000] d.Z. 1487.689289: rcu_dynticks_curr_cpu_in_eqs <-rcu_nmi_enter

NMI starts:
f-4447 [000] d.Z. 1487.689290: nmi_entry: vector=2
f-4447 [000] d.Z. 1487.689291: default_do_nmi <-do_nmi
f-4447 [000] d.Z. 1487.689291: nmi_handle <-default_do_nmi
f-4447 [000] d.Z. 1487.689291: nmi_cpu_backtrace_handler <-nmi_handle

It handles very fast:
f-4447 [000] d.Z. 1487.689292: nmi_handler: nmi_cpu_backtrace_handler()
delta_ns: 780 handled: 0

But some other functions continues running in the NMI context:
[Dazed and confused, but trying to continue message ]
f-4447 [000] d.Z. 1487.689292: _raw_spin_trylock <-default_do_nmi
f-4447 [000] d.Z. 1487.689566: unknown_nmi_error <-default_do_nmi
f-4447 [000] d.Z. 1487.689566: nmi_handle <-unknown_nmi_error
f-4447 [000] d.Z. 1487.689567: printk <-unknown_nmi_error
f-4447 [000] d.Z. 1487.689567: vprintk_func <-printk
f-4447 [000] d.Z. 1487.689567: printk_safe_log_store <-vprintk_func
f-4447 [000] d.Z. 1487.689569: arch_irq_work_raise <-irq_work_queue
f-4447 [000] d.Z. 1487.689569: default_send_IPI_self <-arch_irq_work_raise
f-4447 [000] d.Z. 1487.689569: __default_send_IPI_shortcut
<-default_send_IPI_self
f-4447 [000] d.Z. 1487.689585: native_apic_wait_icr_idle <-arch_irq_work_raise
f-4447 [000] d.Z. 1487.689588: printk <-unknown_nmi_error
f-4447 [000] d.Z. 1487.689588: vprintk_func <-printk
f-4447 [000] d.Z. 1487.689588: printk_safe_log_store <-vprintk_func
f-4447 [000] d.Z. 1487.689588: printk <-unknown_nmi_error.cold.13
f-4447 [000] d.Z. 1487.689588: vprintk_func <-printk
f-4447 [000] d.Z. 1487.689588: printk_safe_log_store <-vprintk_func

Handler finishes:
f-4447 [000] d.Z. 1487.689589: nmi_exit: vector=2
f-4447 [000] d.Z. 1487.689589: rcu_nmi_exit <-do_nmi
f-4447 [000] d.Z. 1487.689589: rcu_irq_exit <-rcu_nmi_exit

And it was actually also (potentially) delaying an timer:
f-4447 [000] d... 1487.689590: smp_irq_work_interrupt <-irq_work_interrupt

So, the nmi_handle tracepoint would not make suspicious that the NMI caused the
delay, but the irq_vectors would (because of their position).

Finally, we would have to enable a single class of tracepoints to monitor all
interrupts.

PS: I am using the vanilla kernel in a vm, so the numbers are higher, but still,
they have the same scale in the bare metal.

Thoughts?

-- Daniel

>> The tracepoints looks like:
>> <idle>-0 [000] d.Z. 179.594315: nmi_entry: vector=2
>> <idle>-0 [000] d.Z. 179.594396: nmi_exit: vector=2
>>
>> [1] trace,x86: irq vector tracepoint support
>> https://lwn.net/Articles/555465/
>
> Please use https://lkml.kernel.org/r/$MESSAGE-ID
>
>> [2] commit 4b9a8dca0e58 ("x86/idt: Remove the tracing IDT completely")
>>
>> Changes from V1:
>> - Fix a compilation problem when CONFIG_X86_LOCAL_APIC is not set (kbuild test)
>
> Please put changes below the '---' separator. They are not part of the
> final changelog and having them below spares the maintainer the time to
> remove them manually.
>
> Thanks,
>
> tglx
>