Re: the maxcpus= boot parameter broke somewhere along the line

From: Srivatsa S. Bhat
Date: Fri Mar 09 2012 - 06:23:29 EST


On 03/08/2012 12:44 AM, Jeff Moyer wrote:

> "Srivatsa S. Bhat" <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> writes:
>
>> On 03/06/2012 11:38 PM, Jeff Moyer wrote:
>>
>>> Sasha Levin <levinsasha928@xxxxxxxxx> writes:
>>>
>>>> I can't reproduce it locally with a 3.3-rc5 kernel.
>>>
>>> First, thanks for looking into it. I just did a git pull, up to -rc6,
>>> and the problem still persists on my machine.
>>>
>>
>>
>> I tried 3.3-rc4 as well as 3.3-rc6+ (last commit dac12d1). I did not
>> see the problem in either case.
>
> I bisected the issue, and it landed here:
>
> 8a25a2fd126c621f44f3aeaef80d51f00fc11639 is the first bad commit
> commit 8a25a2fd126c621f44f3aeaef80d51f00fc11639
> Author: Kay Sievers <kay.sievers@xxxxxxxx>
> Date: Wed Dec 21 14:29:42 2011 -0800
>
> cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular
> subsystem
>
> Unfortunately, that's a HUGE commit.
>


This was from your dmesg:

sd 0:0:10:1: [sdk] Attached SCSI disk
readahead: starting
udev: starting version 147
SMP alternatives: switching to SMP code
WARNING! power/level is deprecated; use power/control instead
EDAC MC: Ver: 2.1.0
Booting Node 0 Processor 3 APIC 0x3
smpboot cpu 3: start_ip = 9a000
EDAC MC0: Giving out device to 'i3200_edac' 'i3200': DEV 0000:00:00.0
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0 Processor 2 APIC 0x1
smpboot cpu 2: start_ip = 9a000
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0 Processor 1 APIC 0x2
smpboot cpu 1: start_ip = 9a000
NMI watchdog enabled, takes one hw-pmu counter.


Looking at the mention of udev above, and considering the commit you bisected
to, I think it would be good to check whether someone is writing 1 to
/sys/devices/system/cpu/cpu*/online, which would cause the CPUs to get
hot-added towards the end of boot. Maybe that sounds stupid, but it's worth
a try :)
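
A quick way to see the end result from userspace is to count the CPUs listed
in /sys/devices/system/cpu/online once boot has finished and compare that
with the value passed to maxcpus=. The following is only a minimal sketch of
that check: the helper names count_cpus_in_list() and count_online_cpus() are
made up for illustration, and it assumes the file holds the usual CPU-list
format (for example "0" or "0-3").

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the CPUs named in a sysfs-style CPU list such as "0", "0-3" or
 * "0,2-3". Returns -1 if the string is malformed.
 * (Hypothetical helper, not a kernel or libc API.)
 */
static int count_cpus_in_list(const char *list)
{
	int count = 0;
	const char *p = list;

	while (*p && *p != '\n') {
		char *end;
		long first, last;

		first = strtol(p, &end, 10);
		if (end == p || first < 0)
			return -1;
		last = first;
		p = end;

		/* An optional "-N" turns the single CPU into a range. */
		if (*p == '-') {
			p++;
			last = strtol(p, &end, 10);
			if (end == p || last < first)
				return -1;
			p = end;
		}
		count += (int)(last - first + 1);

		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;
	}
	return count;
}

/*
 * Read a CPU list from the given file (assumed to be
 * /sys/devices/system/cpu/online) and return how many CPUs it names,
 * or -1 if the file cannot be read or parsed.
 */
static int count_online_cpus(const char *path)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return count_cpus_in_list(buf);
}
```

Calling count_online_cpus("/sys/devices/system/cpu/online") from a small
main() and getting more than the maxcpus= value would confirm that extra CPUs
were brought up after boot, though it would not say who did it.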

So can you try the debug patch below? It applies on top of the latest
linux-3.3-rc6+.

---

drivers/base/cpu.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4dabf50..49d5f83 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -43,11 +43,13 @@ static ssize_t __ref store_online(struct device *dev,
 	cpu_hotplug_driver_lock();
 	switch (buf[0]) {
 	case '0':
+		printk("CPU %d offline initiated from userspace\n", cpu->dev.id);
 		ret = cpu_down(cpu->dev.id);
 		if (!ret)
 			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
 		break;
 	case '1':
+		printk("CPU %d online initiated from userspace\n", cpu->dev.id);
 		ret = cpu_up(cpu->dev.id);
 		if (!ret)
 			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
