Re: [GIT pull] x86 APIC updates for 4.15

From: Maarten Lankhorst
Date: Thu Nov 30 2017 - 07:04:23 EST


On 30-11-17 at 10:18, Thomas Gleixner wrote:
> Maarten,
>
> On Wed, 29 Nov 2017, Maarten Lankhorst wrote:
>> The changes to interrupts bring down our CI during hibernate, see:
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=103712
>>
>> I created a bug report at https://bugzilla.kernel.org/show_bug.cgi?id=198033
>>
>> Short reproducer:
>>
>> Create a swapfile on a snb 2600 and attempt to hibernate to it with
>> echo disk > /sys/power/state. This will fail in the end, but will go
>> through most of the steps.
>>
>> After the almost-complete hibernate, i915 no longer receives any irqs,
>> which kills our entire integration testing.
>>
>> Kernel config is available at
>> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3402/kernel.config.bz2
>> Results with pull request reverted at
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7333/shards.html
>>
>> First bad commit:
>>
>> commit fdba46ffb4c203b6e6794163493fd310f98bb4be (HEAD, refs/bisect/bad)
>> Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Date: Wed Sep 13 23:29:27 2017 +0200
>>
>> x86/apic: Get rid of multi CPU affinity
>>
> < SNIP >
>> Could you have a look at it please?
> I had a look at it. Do I need to do anything else? :)
>
> Seriously. Can you please do the following:
>
> 1) Enable CONFIG_GENERIC_IRQ_DEBUGFS
>
> 2) mount debugfs
>
> 3) Before suspend collect information from there
>
> cat /sys/kernel/debug/irq/domains/*
>
> and
>
> cat /sys/kernel/debug/irq/irqs/$N
>
> where $N is the interrupt which does not fire anymore
>
> 4) suspend/resume
>
> 5) Collect the same data as in #3
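
For anyone repeating steps 3 to 5, the collection can be scripted roughly as below. This is only a sketch: snapshot_irq is a made-up helper name, and the default path assumes debugfs is mounted at /sys/kernel/debug with CONFIG_GENERIC_IRQ_DEBUGFS enabled.

```shell
#!/bin/sh
# Sketch: snapshot the irq debugfs state for one interrupt.
# snapshot_irq is a hypothetical helper, not an existing tool.
# Usage: snapshot_irq IRQ OUTDIR [DEBUGFS_ROOT]
snapshot_irq() {
    irq=$1
    out=$2
    root=${3:-/sys/kernel/debug}   # assumption: debugfs is mounted here

    mkdir -p "$out" || return 1
    # Same data as step 3: every domain file, each line prefixed by its path.
    grep . "$root"/irq/domains/* > "$out/domains.txt"
    # The descriptor of the interrupt that stops firing.
    cat "$root/irq/irqs/$irq" > "$out/irq-$irq.txt"
}

# Example (as root):
#   snapshot_irq 28 /tmp/irq-before
#   echo disk > /sys/power/state
#   snapshot_irq 28 /tmp/irq-after
#   diff -ru /tmp/irq-before /tmp/irq-after
```

Keeping the two snapshots in separate directories lets a plain diff -ru show only the fields that changed across the hibernate attempt.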

/proc/interrupts:
      CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
 28:   999     0     0     0    29     0     0     0  PCI-MSI 32768-edge  i915

# grep . /sys/kernel/debug/irq/domains/*
/sys/kernel/debug/irq/domains/default:name: VECTOR
/sys/kernel/debug/irq/domains/default: size: 0
/sys/kernel/debug/irq/domains/default: mapped: 27
/sys/kernel/debug/irq/domains/default: flags: 0x00000041
/sys/kernel/debug/irq/domains/IO-APIC-0:name: IO-APIC-0
/sys/kernel/debug/irq/domains/IO-APIC-0: size: 24
/sys/kernel/debug/irq/domains/IO-APIC-0: mapped: 20
/sys/kernel/debug/irq/domains/IO-APIC-0: flags: 0x00000041
/sys/kernel/debug/irq/domains/IO-APIC-0: parent: VECTOR
/sys/kernel/debug/irq/domains/IO-APIC-0: name: VECTOR
/sys/kernel/debug/irq/domains/IO-APIC-0: size: 0
/sys/kernel/debug/irq/domains/IO-APIC-0: mapped: 27
/sys/kernel/debug/irq/domains/IO-APIC-0: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-HT:name: PCI-HT
/sys/kernel/debug/irq/domains/PCI-HT: size: 0
/sys/kernel/debug/irq/domains/PCI-HT: mapped: 0
/sys/kernel/debug/irq/domains/PCI-HT: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-HT: parent: VECTOR
/sys/kernel/debug/irq/domains/PCI-HT: name: VECTOR
/sys/kernel/debug/irq/domains/PCI-HT: size: 0
/sys/kernel/debug/irq/domains/PCI-HT: mapped: 27
/sys/kernel/debug/irq/domains/PCI-HT: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-MSI-2:name: PCI-MSI-2
/sys/kernel/debug/irq/domains/PCI-MSI-2: size: 0
/sys/kernel/debug/irq/domains/PCI-MSI-2: mapped: 7
/sys/kernel/debug/irq/domains/PCI-MSI-2: flags: 0x00000051
/sys/kernel/debug/irq/domains/PCI-MSI-2: parent: VECTOR
/sys/kernel/debug/irq/domains/PCI-MSI-2: name: VECTOR
/sys/kernel/debug/irq/domains/PCI-MSI-2: size: 0
/sys/kernel/debug/irq/domains/PCI-MSI-2: mapped: 27
/sys/kernel/debug/irq/domains/PCI-MSI-2: flags: 0x00000041
/sys/kernel/debug/irq/domains/VECTOR:name: VECTOR
/sys/kernel/debug/irq/domains/VECTOR: size: 0
/sys/kernel/debug/irq/domains/VECTOR: mapped: 27
/sys/kernel/debug/irq/domains/VECTOR: flags: 0x00000041


# cat /sys/kernel/debug/irq/irqs/28
handler: handle_edge_irq
device: 0000:00:02.0
status: 0x00000000
istate: 0x00000000
ddepth: 0
wdepth: 0
dstate: 0x01401200
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
node: 0
affinity: 4
effectiv: 4
pending:
domain: PCI-MSI-2
hwirq: 0x8000
chip: PCI-MSI
flags: 0x10
IRQCHIP_SKIP_SET_WAKE
parent:
domain: VECTOR
hwirq: 0x1c
chip: APIC
flags: 0x0

<hibernate attempt; explosion happens here>

# grep . /sys/kernel/debug/irq/domains/*
/sys/kernel/debug/irq/domains/default:name: VECTOR
/sys/kernel/debug/irq/domains/default: size: 0
/sys/kernel/debug/irq/domains/default: mapped: 27
/sys/kernel/debug/irq/domains/default: flags: 0x00000041
/sys/kernel/debug/irq/domains/IO-APIC-0:name: IO-APIC-0
/sys/kernel/debug/irq/domains/IO-APIC-0: size: 24
/sys/kernel/debug/irq/domains/IO-APIC-0: mapped: 20
/sys/kernel/debug/irq/domains/IO-APIC-0: flags: 0x00000041
/sys/kernel/debug/irq/domains/IO-APIC-0: parent: VECTOR
/sys/kernel/debug/irq/domains/IO-APIC-0: name: VECTOR
/sys/kernel/debug/irq/domains/IO-APIC-0: size: 0
/sys/kernel/debug/irq/domains/IO-APIC-0: mapped: 27
/sys/kernel/debug/irq/domains/IO-APIC-0: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-HT:name: PCI-HT
/sys/kernel/debug/irq/domains/PCI-HT: size: 0
/sys/kernel/debug/irq/domains/PCI-HT: mapped: 0
/sys/kernel/debug/irq/domains/PCI-HT: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-HT: parent: VECTOR
/sys/kernel/debug/irq/domains/PCI-HT: name: VECTOR
/sys/kernel/debug/irq/domains/PCI-HT: size: 0
/sys/kernel/debug/irq/domains/PCI-HT: mapped: 27
/sys/kernel/debug/irq/domains/PCI-HT: flags: 0x00000041
/sys/kernel/debug/irq/domains/PCI-MSI-2:name: PCI-MSI-2
/sys/kernel/debug/irq/domains/PCI-MSI-2: size: 0
/sys/kernel/debug/irq/domains/PCI-MSI-2: mapped: 7
/sys/kernel/debug/irq/domains/PCI-MSI-2: flags: 0x00000051
/sys/kernel/debug/irq/domains/PCI-MSI-2: parent: VECTOR
/sys/kernel/debug/irq/domains/PCI-MSI-2: name: VECTOR
/sys/kernel/debug/irq/domains/PCI-MSI-2: size: 0
/sys/kernel/debug/irq/domains/PCI-MSI-2: mapped: 27
/sys/kernel/debug/irq/domains/PCI-MSI-2: flags: 0x00000041
/sys/kernel/debug/irq/domains/VECTOR:name: VECTOR
/sys/kernel/debug/irq/domains/VECTOR: size: 0
/sys/kernel/debug/irq/domains/VECTOR: mapped: 27
/sys/kernel/debug/irq/domains/VECTOR: flags: 0x00000041

# cat /sys/kernel/debug/irq/irqs/28
handler: handle_edge_irq
device: 0000:00:02.0
status: 0x00000400
_IRQ_NOPROBE
istate: 0x00000000
ddepth: 0
wdepth: 0
dstate: 0x01401300
IRQD_ACTIVATED
IRQD_IRQ_STARTED
IRQD_SINGLE_TARGET
IRQD_AFFINITY_SET
IRQD_SETAFFINITY_PENDING
node: 0
affinity: 0,5-7
effectiv: 0
pending: 4
domain: PCI-MSI-2
hwirq: 0x8000
chip: PCI-MSI
flags: 0x10
IRQCHIP_SKIP_SET_WAKE
parent:
domain: VECTOR
hwirq: 0x1c
chip: APIC
flags: 0x0
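
To make the difference between the two irq 28 dumps easier to spot, the dstate words can be XORed. A minimal sketch using only the two values pasted above; it assumes the flag names debugfs prints are a complete decoding of dstate:

```shell
#!/bin/sh
# dstate of irq 28 as reported in the two dumps above.
before=0x01401200   # before the hibernate attempt
after=0x01401300    # after the failed hibernate

# XOR leaves only the bits that differ between the two states.
changed=$(printf '0x%08x' $(( $before ^ $after )))
echo "dstate bits changed: $changed"
# prints "dstate bits changed: 0x00000100"
```

A single bit changes, which lines up with the one flag that appears only in the second dump, IRQD_SETAFFINITY_PENDING. That goes together with the pending mask now showing 4 and the effective affinity having moved from CPU 4 to CPU 0.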

Thanks for looking,
~Maarten