Re: [POSSIBLE BUG] behavior change in irq_can_handle_pm() introduced in 8d39d6ec4db5d

In reply to: Luigi Rizzo: "[POSSIBLE BUG] behavior change in irq_can_handle_pm() introduced in 8d39d6ec4db5d"

From: Luigi Rizzo

Date: Mon Nov 10 2025 - 19:26:48 EST

On Sat, Nov 8, 2025 at 10:30 PM Luigi Rizzo <lrizzo@xxxxxxxxxx> wrote:
>
> BACKGROUND (just to explain how I found the issue; it may exist regardless):
>
> I have some code (soon to be posted here) to implement interrupt moderation
> in software using per-CPU hrtimers. The basic logic is the following:
>
> - if the system decides an irq needs moderation, it calls disable_irq_nosync(),
> adds the irq_desc in a per-cpu list, and keeps IRQD_IRQ_INPROGRESS set
> to prevent migration. The first desc inserted in the list also starts
> an hrtimer;
>
> - when the timer fires, the callback clears the bit and calls enable_irq()
> on all linked irq_desc's
>
> The relevant code is the following:
>
> @@ -207,x +208,x @@ irqreturn_t handle_irq_event(struct irq_desc *desc)
>
> raw_spin_lock(&desc->lock);
> + /* if moderation kicks in, disable_irq_nosync() and set an
> +  * hrtimer. Keep the bit set to prevent migration */
> + if (irq_moderation_has_started_timer_and_disabled_irq(desc))
> + return ret;
> irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
> return ret;
...

After further debugging, I found that the problem is that disable_irq_nosync()
operates lazily. It marks the interrupt as disabled but leaves it unmasked,
acting on the chip only at the next interrupt. With commit
8d39d6ec4db5d ("genirq: Prevent migration live lock in handle_edge_irq()"),
the next interrupt will find IRQD_IRQ_INPROGRESS set and block
until the flag is cleared, but that could only happen if the timer handler were
allowed to run on the same CPU.

I guess the problem can be avoided by calling
irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY);
on the interrupts where I want to use my changes in handle_irq_event().
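
To make the sequence concrete, here is a minimal user-space sketch of the
state machine described above. It is a toy model, not kernel code: the
toy_* names, the struct fields and the bounded spin are all made up for
illustration, and it only mirrors the behavior as I described it in this
message (lazy disable, INPROGRESS kept set, timer on the same CPU).

```c
#include <stdbool.h>

/* Toy model only; none of these names are kernel symbols. */
struct toy_irq {
	bool disabled;   /* software state, stands in for IRQD_IRQ_DISABLED */
	bool masked;     /* line actually masked at the chip */
	bool inprogress; /* stands in for IRQD_IRQ_INPROGRESS */
	bool unlazy;     /* stands in for IRQ_DISABLE_UNLAZY */
};

enum toy_result {
	TOY_NOT_DELIVERED,     /* line masked, interrupt never arrives */
	TOY_STALLED,           /* handler spins on inprogress forever */
	TOY_MASKED_ON_ARRIVAL, /* lazy disable completed at this interrupt */
	TOY_HANDLED,           /* normal handling */
};

/* Models disable_irq_nosync(): lazy unless the unlazy flag is set. */
static void toy_disable_nosync(struct toy_irq *irq)
{
	irq->disabled = true;
	if (irq->unlazy)
		irq->masked = true;
}

/* Moderation kicks in: disable the irq and keep inprogress set. */
static void toy_moderate(struct toy_irq *irq)
{
	irq->inprogress = true;
	toy_disable_nosync(irq);
}

/* Models the hrtimer callback: clear the bit and re-enable the irq. */
static void toy_timer_fires(struct toy_irq *irq)
{
	irq->inprogress = false;
	irq->disabled = false;
	irq->masked = false;
}

/*
 * Next interrupt on the same CPU. The timer callback cannot run while
 * this spins, so nothing clears inprogress; max_spins bounds the loop
 * so the model terminates instead of hanging.
 */
static enum toy_result toy_next_interrupt(struct toy_irq *irq, int max_spins)
{
	int spins = 0;

	if (irq->masked)
		return TOY_NOT_DELIVERED;

	while (irq->inprogress) {
		if (++spins > max_spins)
			return TOY_STALLED;
	}

	if (irq->disabled) {
		irq->masked = true;
		return TOY_MASKED_ON_ARRIVAL;
	}
	return TOY_HANDLED;
}
```

In this model, calling toy_moderate() on a lazy irq and then
toy_next_interrupt() reports a stall, while the same sequence with
unlazy set reports that the interrupt is never delivered, which is the
effect I am hoping to get from IRQ_DISABLE_UNLAZY.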

However, I still wonder whether the change of behavior is intentional or an
undesired side effect.

thanks
luigi