RE: [RFC 06/12] genirq: Add per-cpu flow handler with conditional IRQ stats

From: Thomas Gleixner
Date: Thu Jun 06 2024 - 05:42:05 EST


On Thu, Jun 06 2024 at 03:14, Michael Kelley wrote:
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Sent: Wednesday, June 5, 2024 7:20 AM
>>
>> On Wed, Jun 05 2024 at 13:45, Michael Kelley wrote:
>> > From: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Sent: Wednesday, June 5, 2024 6:20 AM
>> >
>> > In /proc/interrupts, the double-counting isn't a problem, and is
>> > potentially helpful as you say. But /proc/stat, for example, shows a total
>> > interrupt count, which will be roughly double what it was before. That
>> > /proc/stat value then shows up in user space in vmstat, for example.
>> > That's what I was concerned about, though it's not a huge problem in
>> > the grand scheme of things.
>>
>> That's trivial to solve. We can mark interrupts to be excluded from
>> /proc/stat accounting.
>>
>
> OK. On x86, some simple #ifdef'ery in arch_irq_stat_cpu() can filter
> out the HYP interrupts. But what do you envision on arm64, where
> there is no arch_irq_stat_cpu()? On arm64, the top-level interrupt is a
> normal Linux IRQ, and its count is included in the "kstat.irqs_sum" field
> with no breakout by IRQ. Identifying the right IRQ and subtracting it
> out later looks a lot uglier than the conditional stats accounting.

Sure. There are two ways to solve that:

1) Introduce a IRQ_NO_PER_CPU_STATS flag, mark the interrupt
accordingly and make the stats increment conditional on it.
The downside is that the conditional affects every interrupt.

2) Do something like this:

static inline
void __handle_percpu_irq(struct irq_desc *desc, irqreturn_t (*handle)(struct irq_desc *))
{
struct irq_chip *chip = irq_desc_get_chip(desc);

if (chip->irq_ack)
chip->irq_ack(&desc->irq_data);

handle(desc);

if (chip->irq_eoi)
chip->irq_eoi(&desc->irq_data);
}

void handle_percpu_irq(struct irq_desc *desc)
{
/*
* PER CPU interrupts are not serialized. Do not touch
* desc->tot_count.
*/
__kstat_incr_irqs_this_cpu(desc);
__handle_percpu_irq(desc, handle_irq_event_percpu);
}

void handle_percpu_irq_nostat(struct irq_desc *desc)
{
__this_cpu_inc(desc->kstat_irqs->cnt);
__handle_percpu_irq(desc, __handle_irq_event_percpu);
}

So that keeps the interrupt accounted for in /proc/interrupts. If you
don't want that remove the __this_cpu_inc() and mark the interrupt with
irq_set_status_flags(irq, IRQ_HIDDEN). That will exclude it from
/proc/interrupts too.

Thanks,

tglx