Re: [RFC PATCH v4 2/4] genirq/cpuhotplug: Dynamically isolate CPUs from managed interrupts

From: Thomas Gleixner
Date: Sun Dec 01 2024 - 08:43:24 EST


On Sun, Dec 01 2024 at 14:42, Costa Shulyupin wrote:
> After change of housekeeping_cpumask(HK_TYPE_MANAGED_IRQ) during runtime
> managed interrupts continue to run on isolated CPUs.
>
> Dynamic CPU isolation is a complex task. One approach is:
> 1. Set affected CPUs offline and disable relevant interrupts
> 2. Change housekeeping_cpumask
> 3. Set affected CPUs online and enable relevant interrupts
>
> irq_restore_affinity_of_irq() restores managed interrupts
> during complex CPU hotplug process of bringing back a CPU online.
>
> Leave disabled the interrupts whose affinity doesn't intersect
> with the new housekeeping_cpumask, thereby ensuring isolation
> of the CPU from managed interrupts.

And thereby breaking drivers, which will restore the per cpu queue and
expect interrupts to work.

The semantics of HK_TYPE_MANAGED_IRQ are clearly not what you try to
make them. See the description of the "managed_irq" command line
parameter:

        Isolate from being targeted by managed interrupts
        which have an interrupt mask containing isolated
        CPUs. The affinity of managed interrupts is
        handled by the kernel and cannot be changed via
        the /proc/irq/* interfaces.

        This isolation is best effort and only effective
        if the automatically assigned interrupt mask of a
        device queue contains isolated and housekeeping
        CPUs. If housekeeping CPUs are online then such
        interrupts are directed to the housekeeping CPU
        so that IO submitted on the housekeeping CPU
        cannot disturb the isolated CPU.

        If a queue's affinity mask contains only isolated
        CPUs then this parameter has no effect on the
        interrupt routing decision, though interrupts are
        only delivered when tasks running on those
        isolated CPUs submit IO. IO submitted on
        housekeeping CPUs has no influence on those
        queues.

It's pretty clear, no?
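
For illustration only, the routing rule described above can be sketched
as a small user-space model. This is not the kernel implementation: the
type cpu_bits and the function managed_irq_target_mask() are
hypothetical names, and CPUs are modelled as bits in a 64-bit word
rather than a real cpumask.

```c
#include <stdint.h>

/* Hypothetical stand-in for a CPU mask: bit N set means CPU N. */
typedef uint64_t cpu_bits;

/*
 * Model of the best-effort "managed_irq" routing decision.
 *
 *   affinity:     mask automatically assigned to the device queue
 *   housekeeping: CPUs that are not isolated
 *   online:       CPUs that are currently online
 *
 * Returns the set of CPUs the interrupt may be directed to.
 */
static cpu_bits managed_irq_target_mask(cpu_bits affinity,
                                        cpu_bits housekeeping,
                                        cpu_bits online)
{
    cpu_bits preferred = affinity & housekeeping;

    /*
     * The queue mask contains at least one online housekeeping CPU:
     * direct the interrupt there and keep it off the isolated CPUs.
     */
    if (preferred & online)
        return preferred;

    /*
     * Either the queue mask contains only isolated CPUs, or none of
     * its housekeeping CPUs is online. The parameter then has no
     * effect and the interrupt stays usable on the original mask.
     */
    return affinity;
}
```

With housekeeping CPUs 0-1 and all CPUs online, a queue mask of 0-3
yields 0-1, while a queue mask of 2-3 is returned unchanged. In neither
case is the interrupt left disabled, which is the point: the isolation
narrows the target when it can and otherwise does nothing.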

Thanks,

tglx