Re: +generic-ipi-fix-the-race-between-generic_smp_call_function_-and-hotplug_cfd.patchadded to -mm tree

From: Suresh Siddha
Date: Mon Sep 14 2009 - 20:18:12 EST


On Mon, 2009-09-14 at 00:22 -0700, Xiao Guangrong wrote:
>
> Suresh Siddha wrote:
>
> >> CPU A call smp_call_function_many(wait=0) that want CPU B to call
> >> a specific function, after smp_call_function_many() return, we let
> >> CPU A offline immediately. Unfortunately, if CPU B receives this
> >> IPI interrupt after CPU A down, it will crash like above description.
> >
> > How can cpu B receive the IPI interrupt after cpu A is down?
> >
> > As part of the cpu A going down, we first do the stop machine. i.e.,
> > schedule the stop machine worker threads on each cpu. So, by the time
> > all the worker threads on all the cpu's get scheduled and synchronized,
> > ipi on B should get delivered.
> >
>
> Actually, those two examples have the same reason, that is how long
> the destination CPU will receive the IPI interruption?
>
> If the stop machine threads can schedule in CPU B during the IPI
> interruption delivering, It will occur those issue.
>
> I understand what you say but let me confuse is how we ensure it? The IPI
> interruption is delivered over the APIC bus, It need several CPU instruction
> cycle I guess, I also have read the spec of Intel 64 and IA-32, but not find
> the answer, could you point out for me?

Xiao, There is quite a bit of time between the time a particular cpu
sends a smp call function IPI (with wait == 0) and the time that cpu
starts running the stop machine thread and the whole system proceeds
with stop machine. With in this time, typically the smp call function
destination will receive the ipi interrupt. But theoretically the
problem you explain might happen.

>From P4 onwards, interrupts are delivered over system bus and with NHM
it is QPI. Also, the mechanism of scheduling the stop machine thread on
a particular cpu involves resched IPI etc.

Nevertheless, Have you seen a real hang or system crash due to this? If
so, on what platform?

Ideally, for xapic based platform, clear status of sender APIC ICR's
delivery status indicates that the interrupt is registered at the
receiver. for x2apic based platform, sending another interrupt will
ensure that the previous interrupt was delivered.

If you have indeed seen a crash related to this, can you review and give
the appended patch a try and see if it fixes the issue? If you agree
with the fix, then I will send the patch with a detailed change log etc.

Your current fix is not clean and not complete in my opinion (as calling
interrupt handlers manually and not doing the callbacks etc might cause
other side affects). Thanks.
---

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 9e3d8af..69ec2a9 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -93,8 +93,9 @@ void generic_smp_call_function_single_interrupt(void);
void generic_smp_call_function_interrupt(void);
void ipi_call_lock(void);
void ipi_call_unlock(void);
void ipi_call_lock_irq(void);
void ipi_call_unlock_irq(void);
+void quiesce_smp_call_functions(void);
#endif

/*
diff --git a/kernel/smp.c b/kernel/smp.c
index 8e21850..d13a888 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -479,6 +479,27 @@ int smp_call_function(void (*func)(void *), void *info, int wait)
}
EXPORT_SYMBOL(smp_call_function);

+void quiesce_smp_call_functions(void)
+{
+ struct call_single_queue *q = &__get_cpu_var(call_single_queue);
+ bool empty;
+ unsigned long flags;
+
+ do {
+ cpu_relax();
+ spin_lock_irqsave(&q->lock, flags);
+ empty = list_empty(&q->list);
+ spin_unlock_irqrestore(&q->lock, flags);
+ } while (!empty);
+
+ do {
+ cpu_relax();
+ spin_lock_irqsave(&call_function.lock, flags);
+ empty = list_empty(&call_function.queue);
+ spin_unlock_irqrestore(&call_function.lock, flags);
+ } while (!empty);
+}
+
void ipi_call_lock(void)
{
spin_lock(&call_function.lock);
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 912823e..dd2d90f 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -86,6 +86,9 @@ static void stop_cpu(struct work_struct *unused)
curstate = state;
switch (curstate) {
case STOPMACHINE_DISABLE_IRQ:
+#ifdef CONFIG_USE_GENERIC_SMP_HELPERS
+ quiesce_smp_call_functions();
+#endif
local_irq_disable();
hard_irq_disable();
break;




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/