Re: [PATCH v5 00/45] CPU hotplug: stop_machine()-free CPU hotplug
From: Srivatsa S. Bhat
Date: Mon Feb 18 2013 - 05:53:38 EST
On 02/18/2013 04:04 PM, Srivatsa S. Bhat wrote:
> On 02/18/2013 03:54 PM, Vincent Guittot wrote:
>> On 15 February 2013 20:40, Srivatsa S. Bhat
>> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
>>> Hi Vincent,
>>>
>>> On 02/15/2013 06:58 PM, Vincent Guittot wrote:
>>>> Hi Srivatsa,
>>>>
>>>> I have run some tests with your branch (thanks Paul for the git tree)
>>>> and you will find the results below.
>>>>
>>>
>>> Thank you very much for testing this patchset!
>>>
>>>> The test conditions are:
>>>> - 5-CPU system in 2 clusters
>>>> - The test plugs/unplugs CPU2, and it increases the system load every
>>>> 20 plug/unplug sequences by adding more cyclictest threads
>>>> - The test is done with all CPUs online and with only CPU0 and CPU2
>>>>
>>>> The main conclusion is that there are no differences with and without
>>>> your patches in my stress tests. I'm not sure if that was the
>>>> expected result, but cpu_down is already quite fast: 4-5ms on
>>>> average
>>>>
>>>
>>> At least my patchset doesn't perform _worse_ than mainline, with respect
>>> to cpu_down duration :-)
>>
>> Yes, exactly, and it has passed more than 400 consecutive plug/unplug
>> cycles on an ARM platform
>>
>
> Great! However, did you turn on CPU_IDLE during your tests?
>
> In my tests, I had turned off cpu idle in the .config, like I had mentioned in
> the cover letter. I'm struggling to get it working with CPU_IDLE/INTEL_IDLE
> turned on, because it gets into a lockup almost immediately. It appears that
> the holder of clockevents_lock never releases it, for some reason...
> See below for the full log. Lockdep has not been useful in debugging this,
> unfortunately :-(
>
Ah, never mind, the following diff fixes it :-) I had applied this fix on top
of v5 and tested it, but it still had races and I used to hit the lockups. Now,
after fixing all the memory barrier issues that Paul and Oleg pointed out in v5,
I applied this fix again and tested it just now - it works beautifully! :-)
I'll include this fix and post a v6 soon.
Regards,
Srivatsa S. Bhat
--------------------------------------------------------------------------->
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 30b6de0..ca340fd 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/smp.h>
+#include <linux/cpu.h>
#include "tick-internal.h"
@@ -431,6 +432,7 @@ void clockevents_notify(unsigned long reason, void *arg)
unsigned long flags;
int cpu;
+ get_online_cpus_atomic();
raw_spin_lock_irqsave(&clockevents_lock, flags);
clockevents_do_notify(reason, arg);
@@ -459,6 +461,7 @@ void clockevents_notify(unsigned long reason, void *arg)
break;
}
raw_spin_unlock_irqrestore(&clockevents_lock, flags);
+ put_online_cpus_atomic();
}
EXPORT_SYMBOL_GPL(clockevents_notify);
#endif
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/