Re: 2.6.29-rc1 does not boot

From: Mike Travis
Date: Sun Jan 11 2009 - 14:02:49 EST


Dieter Ries wrote:
> Hi,
>
> Ingo Molnar schrieb:
>>>> * Dieter Ries <clip2@xxxxxx> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just pulled 2.6.29-rc1, ran oldconfig with defaults and built it.
>>>>> When I try to boot it, that kind of works until init should start. Then
>>>>> nothing happens. I tried with init=/bin/bash, which sometimes works, and
>>>>> sometimes gets me a bash without the prompt flashing.
>>>>>
>>>>> I captured the output with netconsole, but I cannot see a problem there.
>>>>> It is attached.
>>>>>
>>>>> My config is also attached.
>>>>>
>>>>> The machine:
>>>>>
>>>>> Lenovo Thinkpad T60
>>>>> Core2Duo 2GHz
>>>>>
>>>>> Gentoo 64bit
>>>>>
>>>>>
>>>>> What else should I provide for debugging that?
>> Unless you can see some particular badness in the kernel messages
>> (something that changed to the last working version) that narrows it down
>> to some subsystem, i suspect this would have to be bisected ...
>
> Bisected it:
>
> ####################################################################
> 7503bfbae89eba07b46441a5d1594647f6b8ab7d is first bad commit
> commit 7503bfbae89eba07b46441a5d1594647f6b8ab7d
> Author: Mike Travis <travis@xxxxxxx>
> Date: Sun Jan 4 05:18:09 2009 -0800
>
> cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
>
> Impact: use new cpumask API to reduce stack usage
>
> Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr()
> with a work_on_cpu function for drv_read() and drv_write().
>
> Basically converts do_drv_{read,write} into "work_on_cpu" functions that
> are now called by drv_read and drv_write.
>
> Signed-off-by: Mike Travis <travis@xxxxxxx>
> Acked-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> ####################################################################
>
> I reverted that patch, which makes my machine boot again. So I guess
> theres something wrong here. Please tell me which information you need
> to fix the problem, I will help as I can.
>
>
>> Ingo
>>
>
> cu
> Dieter
>

Thanks for catching this.

The work_on_cpu approach seems to create more problems than it solves. And
testing it has proven difficult without the right combination of hardware.

Could you send me the console log and config file?

Rusty - any ideas on how to avoid these clashes with the get_online_cpus()
call in work_on_cpu()? Or something else to indicate to lockdep that
the circular lock dependency is ok (as you mentioned before)?

Thanks,
Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/