Re: [PATCH V3 4/4] genirq/affinity: irq vector spread among online CPUs as far as possible

From: Ming Lei
Date: Wed Apr 04 2018 - 11:20:42 EST


On Wed, Apr 04, 2018 at 02:45:18PM +0200, Thomas Gleixner wrote:
> On Wed, 4 Apr 2018, Thomas Gleixner wrote:
> > I'm aware how that hw-queue stuff works. But that only works if the
> > spreading algorithm makes the interrupts affine to offline/not-present CPUs
> > when the block device is initialized.
> >
> > In the example above:
> >
> > > > > irq 39, cpu list 0,4
> > > > > irq 40, cpu list 1,6
> > > > > irq 41, cpu list 2,5
> > > > > irq 42, cpu list 3,7
> >
> > and assumed that at driver init time only CPU 0-3 are online then the
> > hotplug of CPU 4-7 will not result in any interrupt delivered to CPU 4-7.
> >
> > So the extra assignment to CPU 4-7 in the affinity mask has no effect
> > whatsoever and even if the spreading result is 'perfect' it just looks
> > perfect as it is not making any difference versus the original result:
> >
> > > > > irq 39, cpu list 0
> > > > > irq 40, cpu list 1
> > > > > irq 41, cpu list 2
> > > > > irq 42, cpu list 3
>
> And looking deeper into the changes, I think that the first spreading step
> has to use cpu_present_mask and not cpu_online_mask.
>
> Assume the following scenario:
>
> Machine with 8 present CPUs is booted, the 4 last CPUs are
> unplugged. Device with 4 queues is initialized.
>
> The resulting spread is going to be exactly your example:
>
> irq 39, cpu list 0,4
> irq 40, cpu list 1,6
> irq 41, cpu list 2,5
> irq 42, cpu list 3,7
>
> Now the 4 offline CPUs are plugged in again. These CPUs won't ever get an
> interrupt as all interrupts stay on CPU 0-3 unless one of these CPUs is
> unplugged. Using cpu_present_mask the spread would be:
>
> irq 39, cpu list 0,1
> irq 40, cpu list 2,3
> irq 41, cpu list 4,5
> irq 42, cpu list 6,7

Given that physical CPU hotplug isn't common, this spread will leave only
irq 39 and irq 40 active most of the time, which causes exactly the
performance regression that Kashyap reported.
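
To make the point concrete, here is a minimal sketch (not kernel code)
that counts which vectors have at least one online CPU in their mask. The
helper name active_vectors() and the dict layout are made up for
illustration; the two spreads are the ones quoted in this thread, and the
online set assumes only CPU 0-3 are online.

```python
def active_vectors(spread, online):
    """Return the irqs whose affinity mask contains at least one online CPU."""
    return [irq for irq, cpus in spread.items()
            if any(cpu in online for cpu in cpus)]

# Assumption: 8 present CPUs, but only CPU 0-3 are online.
online = {0, 1, 2, 3}

# Spread built from cpu_present_mask, as in the quoted proposal.
present_spread = {39: [0, 1], 40: [2, 3], 41: [4, 5], 42: [6, 7]}

# Spread that covers online CPUs first, as in the quoted example.
online_first_spread = {39: [0, 4], 40: [1, 6], 41: [2, 5], 42: [3, 7]}

present_active = active_vectors(present_spread, online)
online_first_active = active_vectors(online_first_spread, online)

print(present_active)
print(online_first_active)
```

With the present-mask spread only two of the four vectors can be used
until CPU 4-7 come online.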

>
> while on a machine where CPU 4-7 are NOT present, but advertised as
> possible the spread would be:
>
> irq 39, cpu list 0,4
> irq 40, cpu list 1,6
> irq 41, cpu list 2,5
> irq 42, cpu list 3,7

I think this spread is still better, since it avoids the performance
regression and every irq vector is covered by at least one online CPU,
which in practice is often enough.
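
A rough sketch of the idea, assuming a plain round-robin over CPU numbers:
spread the online CPUs across the vectors first, then hand out the
remaining possible CPUs. This is only an illustration, not the code in the
patch; the function name two_stage_spread() is hypothetical, and the
kernel's real spreading is topology-aware, so the exact pairing of the
offline CPUs may differ from what this produces.

```python
def two_stage_spread(num_vectors, possible, online):
    """Hypothetical two-stage spread: online CPUs first, then the rest."""
    masks = [[] for _ in range(num_vectors)]

    online_cpus = sorted(c for c in possible if c in online)
    other_cpus = sorted(c for c in possible if c not in online)

    # Stage 1: round-robin the online CPUs over all vectors.
    for i, cpu in enumerate(online_cpus):
        masks[i % num_vectors].append(cpu)

    # Stage 2: round-robin the remaining possible CPUs over all vectors.
    for i, cpu in enumerate(other_cpus):
        masks[i % num_vectors].append(cpu)

    return masks

# Assumption: 8 possible CPUs, CPU 0-3 online, device with 4 vectors.
online = {0, 1, 2, 3}
masks = two_stage_spread(4, range(8), online)

for vec, cpus in enumerate(masks):
    print("vector", vec, "cpu list", ",".join(str(c) for c in cpus))
```

Whatever the exact pairing, each vector ends up with one online CPU as
long as there are at least as many online CPUs as vectors.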

As I mentioned in another email, I still don't understand why interrupts
can't be delivered to CPU 4-7 after these CPUs become present and online.
In theory, it seems interrupts should be delivered to these CPUs, since
the affinity info has already been programmed into the interrupt
controller.

Or do we still need a CPU hotplug handler in the device driver to tell the
device about the CPU hotplug change, so that interrupts are delivered to
the newly added CPUs?


Thanks,
Ming