Re: [RFC PATCH v2 2/2] x86/perf/amd: Resolve NMI latency issues when multiple PMCs are active

From: Lendacky, Thomas
Date: Mon Mar 18 2019 - 12:40:56 EST


On 3/18/19 5:40 AM, Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 08:41:00PM +0000, Lendacky, Thomas wrote:
>> This issue is different from a previous issue related to perf NMIs that
>> was fixed in commit:
>> 63e6be6d98e1 ("perf, x86: Catch spurious interrupts after disabling counters")
>>
>> The difference here is that NMI latency can produce what appear to be
>> spurious NMIs during the handling of a PMC counter overflow while the
>> counter is still active, as opposed to while the counter is being
>> disabled.
>
> But could we not somehow merge these two cases? The cause is similar
> IIUC. The PMI gets triggered, but then:
>
> - a previous PMI handler handled our overflow, or
> - our event gets disabled.
>
> and when our PMI triggers it finds nothing to do.

Yes, let me look into what we can do here.

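For reference, the catch that 63e6be6d98e1 added to x86_pmu_handle_irq()
works roughly like this (paraphrased from memory, not the verbatim
upstream code): a just-disabled counter leaves a bit set in
cpuc->running, so a late NMI for it can still be claimed:

	/* inside the counter loop of x86_pmu_handle_irq() */
	for (idx = 0; idx < x86_pmu.num_counters; idx++) {
		if (!test_bit(idx, cpuc->active_mask)) {
			/*
			 * The counter was deactivated, but an NMI for it
			 * may still be in flight; claim it as handled.
			 */
			if (__test_and_clear_bit(idx, cpuc->running))
				handled++;
			continue;
		}

		/* ... normal overflow handling for active counters ... */
	}

That is how the handler can return !0 while no counter is active, so a
merged approach has to preserve exactly that path.
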
>
>> +/*
>> + * Because of NMI latency, if multiple PMC counters are active or other sources
>> + * of NMIs are received, the perf NMI handler can handle one or more overflowed
>> + * PMC counters outside of the NMI associated with the PMC overflow. If the NMI
>> + * doesn't arrive at the LAPIC in time to become a pending NMI, then the kernel
>> + * back-to-back NMI support won't be active. This PMC handler needs to take into
>> + * account that this can occur; otherwise, it could result in unknown NMI
>> + * messages being issued. Examples of this are a PMC overflowing while in
>> + * the NMI handler when multiple PMCs are active, or a PMC overflowing while
>> + * handling some other source of an NMI.
>> + *
>> + * Attempt to mitigate this by using the number of active PMCs to determine
>> + * whether to return NMI_HANDLED if the perf NMI handler did not handle/reset
>> + * any PMCs. The per-CPU perf_nmi_counter variable is set to the minimum of
>> + * the number of active PMCs and 2. The value of 2 is used in case an NMI
>> + * does not arrive at the LAPIC in time to be collapsed into an already
>> + * pending NMI.
>> + */
>> +static int amd_pmu_handle_irq(struct pt_regs *regs)
>> +{
>> + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> + int active, handled;
>> +
>> + active = __bitmap_weight(cpuc->active_mask, X86_PMC_IDX_MAX);
>> + handled = x86_pmu_handle_irq(regs);
>> +
>> + /*
>> + * If no counters are active, reset perf_nmi_counter and return
>> + * NMI_DONE
>> + */
>> + if (!active) {
>> + this_cpu_write(perf_nmi_counter, 0);
>> + return NMI_DONE;
>> + }
>
> This will actively render 63e6be6d98e1 void I think. Because that can
> return !0 while !active -- that's rather the whole point of it.

Yup, I didn't take that into account when I made that change. Let me take a
closer look at all of this along with your suggestion from the other email.
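
Maybe something along these lines (completely untested, just to capture
the idea): keep the countdown, but drop the early !active return so
x86_pmu_handle_irq() can still claim NMIs for just-disabled counters
via cpuc->running:

static int amd_pmu_handle_irq(struct pt_regs *regs)
{
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
	int active, handled;

	/*
	 * Snapshot the active count first, since x86_pmu_handle_irq()
	 * can deactivate a counter through x86_pmu_stop().
	 */
	active = __bitmap_weight(cpuc->active_mask, X86_PMC_IDX_MAX);

	/* Process any counter overflows (and the cpuc->running catch) */
	handled = x86_pmu_handle_irq(regs);

	/*
	 * If a counter was handled, record the number of possible
	 * remaining NMIs that can occur.
	 */
	if (handled) {
		this_cpu_write(perf_nmi_counter,
			       min_t(unsigned int, 2, active));

		return handled;
	}

	if (!this_cpu_read(perf_nmi_counter))
		return NMI_DONE;

	this_cpu_dec(perf_nmi_counter);

	return NMI_HANDLED;
}

That way a spurious NMI for a just-disabled counter is still claimed by
the cpuc->running check inside x86_pmu_handle_irq(), and the countdown
only absorbs NMIs that were in flight for overflows already processed.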

I'm traveling this week, so it might be a few days.

Thanks,
Tom

>
>> + /*
>> + * If a counter was handled, record the number of possible remaining
>> + * NMIs that can occur.
>> + */
>> + if (handled) {
>> + this_cpu_write(perf_nmi_counter,
>> + min_t(unsigned int, 2, active));
>> +
>> + return handled;
>> + }
>> +
>> + if (!this_cpu_read(perf_nmi_counter))
>> + return NMI_DONE;
>> +
>> + this_cpu_dec(perf_nmi_counter);
>> +
>> + return NMI_HANDLED;
>> +}
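
To make the countdown concrete: if four PMCs are active and a single NMI
ends up handling three overflows, perf_nmi_counter is armed to
min(2, 4) = 2, so the next two NMIs that arrive and find nothing to do
are still returned as NMI_HANDLED before the handler falls back to
NMI_DONE.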