[tip:x86/urgent] x86/irq: Prevent force migration of irqs which are not in the vector domain

From: tip-bot for Mika Westerberg
Date: Tue Oct 04 2016 - 07:19:02 EST


Commit-ID: db91aa793ff984ac048e199ea1c54202543952fe
Gitweb: http://git.kernel.org/tip/db91aa793ff984ac048e199ea1c54202543952fe
Author: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
AuthorDate: Mon, 3 Oct 2016 13:17:08 +0300
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Tue, 4 Oct 2016 13:13:47 +0200

x86/irq: Prevent force migration of irqs which are not in the vector domain

When a CPU is about to be offlined we call fixup_irqs() that resets IRQ
affinities related to the CPU in question. The same thing is also done when
the system is suspended to S-states like S3 (mem).

For each IRQ we try to complete any on-going move regardless whether the
IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
its chip_data, assume it is of type struct apic_chip_data and manipulate it
by clearing old_domain mask etc. For irq_chips that are not part of the
x86_vector_domain, like those created by various GPIO drivers, will find
their chip_data being changed unexpectly.

Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets
corrupted after resume:

# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
gpio-511 ( |sysfs ) in hi

# rtcwake -s10 -mmem
<10 seconds passes>

# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
gpio-511 ( |sysfs ) in ?

Note '?' in the output. It means the struct gpio_chip ->get function is
NULL whereas before suspend it was there.

Fix this by first checking that the IRQ belongs to x86_vector_domain before
we try to use the chip_data as struct apic_chip_data.

Reported-and-tested-by: Sakari Ailus <sakari.ailus@xxxxxxxxxxxxxxx>
Signed-off-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # 4.4+
Link: http://lkml.kernel.org/r/20161003101708.34795-1-mika.westerberg@xxxxxxxxxxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

---
arch/x86/kernel/apic/vector.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 6066d94..5d30c5e 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -661,11 +661,28 @@ void irq_complete_move(struct irq_cfg *cfg)
*/
void irq_force_complete_move(struct irq_desc *desc)
{
- struct irq_data *irqdata = irq_desc_get_irq_data(desc);
- struct apic_chip_data *data = apic_chip_data(irqdata);
- struct irq_cfg *cfg = data ? &data->cfg : NULL;
+ struct irq_data *irqdata;
+ struct apic_chip_data *data;
+ struct irq_cfg *cfg;
unsigned int cpu;

+ /*
+ * The function is called for all descriptors regardless of which
+ * irqdomain they belong to. For example if an IRQ is provided by
+ * an irq_chip as part of a GPIO driver, the chip data for that
+ * descriptor is specific to the irq_chip in question.
+ *
+ * Check first that the chip_data is what we expect
+ * (apic_chip_data) before touching it any further.
+ */
+ irqdata = irq_domain_get_irq_data(x86_vector_domain,
+ irq_desc_get_irq(desc));
+ if (!irqdata)
+ return;
+
+ data = apic_chip_data(irqdata);
+ cfg = data ? &data->cfg : NULL;
+
if (!cfg)
return;