Re: [PATCH 4/4][RFC v2] x86/apic: Spread the vectors by choosing the idlest CPU

From: Dan Williams
Date: Thu Sep 07 2017 - 02:23:47 EST


On Wed, Sep 6, 2017 at 10:59 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Wed, 6 Sep 2017, Dan Williams wrote:
>
>> On Tue, Sep 5, 2017 at 11:15 PM, Christoph Hellwig <hch@xxxxxx> wrote:
>> > On Wed, Sep 06, 2017 at 12:13:38PM +0800, Yu Chen wrote:
>> >> I agree, the driver could be rewritten, but it might take some time, so
>> >> meanwhile I'm also looking at other possible optimizations.
>> >
>> > Which driver are we talking about anyway? Let's start looking at it
>> > and fix the issue there.
>>
>> As far as I understand, it's already fixed there:
>>
>> commit 7c9ae7f053e9e896c24fd23595ba369a5fe322e1
>
> -ENOSUCHCOMMIT

Sorry, that's still pending in -next.

>> Author: Carolyn Wyborny <carolyn.wyborny@xxxxxxxxx>
>> Date: Tue Jun 20 15:16:53 2017 -0700
>>
>> i40e: Fix for trace found with S4 state
>>
>> This patch fixes a problem found in systems when entering
>> S4 state. This patch fixes the problem by ensuring that
>> the misc vector's IRQ is disabled as well. Without this
>> patch a stack trace can be seen upon entering S4 state.
>>
>> However this seems like something that should be handled generically
>> in the irq-core especially since commit c5cb83bb337c
>> "genirq/cpuhotplug: Handle managed IRQs on CPU hotplug" was headed in
>> that direction. It's otherwise non-obvious when a driver needs to
>> release and re-acquire interrupts or be reworked to use managed
>> interrupts.
>
> There are two problems here:
>
> 1) The driver allocates 300 interrupts and uses exactly 8 randomly chosen
> ones.
>
> 2) It's not using the managed affinity mechanics, so the interrupts cannot
> be sanely handled by the kernel, either affinity-wise or at hotplug
> time.

Ok, this driver is an obvious candidate, but is there a general
guideline for when a driver must use managed affinity? Should we
emit a message when a driver exceeds a certain threshold of
unmanaged interrupts, to flag this in the future?
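
To make the question concrete, here is a rough userspace sketch of the
kind of check meant above. It is only a model: struct irq_model, the
helper names, and the threshold value are all hypothetical, and nothing
here is existing irq-core API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one allocated interrupt (not a kernel struct). */
struct irq_model {
	bool managed;	/* allocated with managed affinity */
	bool in_use;	/* a handler was actually requested for it */
};

/* Hypothetical threshold; no value is proposed in this thread. */
#define UNMANAGED_WARN_THRESHOLD 64

/* Count the interrupts that were not allocated with managed affinity. */
static size_t count_unmanaged(const struct irq_model *irqs, size_t n)
{
	size_t i, count = 0;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!irqs[i].managed)
			count++;
	return count;
}

/* Count the interrupts the driver actually uses. */
static size_t count_in_use(const struct irq_model *irqs, size_t n)
{
	size_t i, count = 0;

	assert(irqs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (irqs[i].in_use)
			count++;
	return count;
}

/* True when the driver holds more unmanaged interrupts than the threshold. */
static bool should_warn(const struct irq_model *irqs, size_t n)
{
	return count_unmanaged(irqs, n) > UNMANAGED_WARN_THRESHOLD;
}
```

With the numbers from this thread (300 allocated, 8 in use, none
managed), should_warn() returns true. Reporting count_in_use() next to
the unmanaged count would also expose the over-allocation.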