Re: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

From: Thomas Gleixner
Date: Thu Jul 02 2015 - 08:50:52 EST


On Thu, 2 Jul 2015, Zhang, Yanmin wrote:
> On 2015/7/2 15:02, Xiao, Jin wrote:
> > Hi Joerg,
> >
> > On 7/2/2015 2:52 PM, Joerg Roedel wrote:
> >> Hi Jin,
> >>
> >> On Thu, Jul 02, 2015 at 12:24:34PM +0800, xiao jin wrote:
> >>> [ 106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
> >>> [ 106.116702] IP:
> >>> [ 106.118490] [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
> >> This might be the same issue I fixed with:
> >>
> >> d97eb8966c91 x86/irq: Check for valid irq descriptor in check_irq_vectors_for_cpu_disable()
> >>
> >> Can you try whether applying this patch on your kernel fixes your
> >> issue?
> >>
> > Yes, I agree d97eb8966c91 fix the IPANIC. I just notice a description from http://lkml.kernel.org/r/20150204132754.GA10078@xxxxxxxx
> >
> > I wish to share the debug result and root cause from my side.
>
> commit d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 looks like a workaround.
>
> Could Jin's patch be merged also?
> free_msi_irqs does have a race with smp_callin=>..=>__setup_vector_irq.

The whole vector stuff is racy versus cpu hotplug and Jins patch
merily addresses a small part of it and by doing that it breaks stuff
as well.

With that patch we move the vector setup after marking the cpu online,
which is wrong because the vector array on that cpu is not up to date
until we call __setup_vector_irq(). Proper patch below.

We still have an issue in the cpu_disable() patch, but I haven't yet
wrapped my head around it completely.

Thanks,

tglx
---
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 28eba2d38b15..f813261d9740 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -409,12 +409,6 @@ static void __setup_vector_irq(int cpu)
int irq, vector;
struct apic_chip_data *data;

- /*
- * vector_lock will make sure that we don't run into irq vector
- * assignments that might be happening on another cpu in parallel,
- * while we setup our initial vector to irq mappings.
- */
- raw_spin_lock(&vector_lock);
/* Mark the inuse vectors */
for_each_active_irq(irq) {
data = apic_chip_data(irq_get_irq_data(irq));
@@ -436,16 +430,16 @@ static void __setup_vector_irq(int cpu)
if (!cpumask_test_cpu(cpu, data->domain))
per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED;
}
- raw_spin_unlock(&vector_lock);
}

/*
- * Setup the vector to irq mappings.
+ * Setup the vector to irq mappings. Must be called with vector_lock held.
*/
void setup_vector_irq(int cpu)
{
int irq;

+ lockdep_assert_held(&vector_lock);
/*
* On most of the platforms, legacy PIC delivers the interrupts on the
* boot cpu. But there are certain platforms where PIC interrupts are
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8add66b22f33..67dc638035d3 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -171,11 +171,6 @@ static void smp_callin(void)
apic_ap_setup();

/*
- * Need to setup vector mappings before we enable interrupts.
- */
- setup_vector_irq(smp_processor_id());
-
- /*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
@@ -246,11 +241,13 @@ static void notrace start_secondary(void *unused)
#endif

/*
- * We need to hold vector_lock so there the set of online cpus
- * does not change while we are assigning vectors to cpus. Holding
- * this lock ensures we don't half assign or remove an irq from a cpu.
+ * Lock vector_lock and initialize the vectors on this cpu
+ * before setting the cpu online. We must set it online with
+ * vector_lock held to prevent a concurrent setup/teardown
+ * from seeing a half valid vector space.
*/
lock_vector_lock();
+ setup_vector_irq(smp_processor_id());
set_cpu_online(smp_processor_id(), true);
unlock_vector_lock();
cpu_set_state_online(smp_processor_id());

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/