Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

From: Thomas Gleixner
Date: Tue Sep 05 2017 - 18:58:01 EST


On Sun, 3 Sep 2017, Thomas Gleixner wrote:

> On Fri, 1 Sep 2017, Chen Yu wrote:
>
> > This is the major logic to spread the vectors on different CPUs.
> > The main idea is to choose the 'idlest' CPU which has assigned
> > the least number of vectors as the candidate/hint for the vector
> > allocation domain, in the hope that the vector allocation domain
> > could leverage this hint to generate corresponding cpumask.
> >
> > One of the requirements to do this vector spreading work comes from the
> > following hibernation problem found on a 16 cores server:
> >
> > CPU 31 disable failed: CPU has 62 vectors assigned and there
> > are only 0 available.

Thinking more about this, this makes no sense whatsoever.

The total number of interrupts on a system is the same whether they are
all on CPU 0 or evenly spread over all CPUs.

As this machine is using physical destination mode, the number of vectors
used is the same as the number of interrupts, except for the case where a
move of an interrupt is in progress and the interrupt which cleans up the
old vector has not yet arrived. Let's ignore that for now.

The available vector space is 204 per CPU on such a system.

256 - SYSTEM[0-31, 32, 128, 239-255] - LEGACY[50] = 204
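
As a sanity check on that number, the subtraction can be spelled out in a
few lines of Python. This is only a sketch: the function name is made up
for illustration, and it assumes that LEGACY[50] stands for a single
reserved vector (vector 50), which is what makes the sum come out to 204.

```python
def available_vectors():
    """Count the vectors left for device interrupts on one CPU."""
    total = 256
    # SYSTEM[0-31, 32, 128, 239-255]: 32 + 1 + 1 + 17 = 51 vectors
    system = set(range(0, 32)) | {32, 128} | set(range(239, 256))
    # Assumption: LEGACY[50] is one reserved vector, number 50
    legacy = {50}
    return total - len(system | legacy)


print(available_vectors())
```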

> > CPU 31 disable failed: CPU has 62 vectors assigned and there
> > are only 0 available.

CPU31 is the last AP going offline (CPU0 is still online).

It wants to move 62 vectors to CPU0, but it can't because CPU0 has 0
available vectors. That means CPU0 has 204 vectors used. I doubt that, but
what I doubt even more is that this interrupt spreading helps in any way.

Assume that we have a total of 204 + 62 = 266 device interrupt vectors in
use and they are evenly spread over 32 CPUs, so each CPU has either 8 or
9 vectors. Fine.

Now if you unplug all CPUs except CPU0, starting from CPU1 up to CPU31,
then at the point where CPU31 is about to be unplugged, CPU0 holds 133
vectors and CPU31 holds 133 vectors as well, assuming that the spread is
exactly even.

I have a hard time figuring out how the 133 vectors on CPU31 are now
magically fitting in the empty space on CPU0, which is 204 - 133 = 71. In
my limited understanding of math 133 is greater than 71, but your patch
might make that magically be wrong.
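
The same arithmetic as a small Python sketch. It is purely illustrative:
the function names are hypothetical, and it assumes a perfectly even
spread, 266 device vectors in total and 204 usable vectors per CPU, as in
the numbers above. It does not model the actual allocator.

```python
def spread_evenly(total, cpus):
    """Split `total` vectors over `cpus` CPUs as evenly as possible."""
    base, extra = divmod(total, cpus)
    return [base + 1 if i < extra else base for i in range(cpus)]


def last_unplug_fits(total_vectors=266, capacity=204):
    """State right before the last AP goes offline: two CPUs remain."""
    cpu0, last_ap = spread_evenly(total_vectors, 2)
    free_on_cpu0 = capacity - cpu0
    # The last AP can only go offline if all of its vectors fit on CPU0
    return cpu0, last_ap, free_on_cpu0, last_ap <= free_on_cpu0


print(spread_evenly(266, 32))   # every CPU ends up with 8 or 9 vectors
print(last_unplug_fits())
```

With these assumptions the last step fails no matter how even the spread
is, because the total number of vectors exceeds what a single CPU can hold.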

Can you please provide detailed information about how many device
interrupts are actually in use/allocated on that system?

Please enable CONFIG_GENERIC_IRQ_DEBUGFS and provide the output of

# cat /sys/kernel/debug/irq/domains/*

and

# ls /sys/kernel/debug/irq/irqs

Thanks,

tglx