Re: 2.6.30-rc8 Oops whilst booting

From: Chris Clayton
Date: Mon Jun 08 2009 - 14:33:24 EST


2009/6/8 James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>:
> On Mon, 2009-06-08 at 09:21 -0700, Linus Torvalds wrote:
>>
>> On Mon, 8 Jun 2009, Chris Clayton wrote:
>> >
>> > OK. I reversed that change and built and installed the kernel. It has
>> > withstood 100 reboots without a panic. Additionally, I pulled the
>> > latest changes (that will be rc8-git5, I think) from kernel.org,
>> > reversed the change to that kernel and built and installed it. That
>> > too withstood 100 reboots without a panic.
>> >
>> > Let me know if there's anything else I can do to help fix this.
>>
>> That's already pretty convincing.
>>
>> James, Arjan? The original oops message is here (a jpg screen capture,
>> unable to open initial console):
>>
>>       http://lkml.org/lkml/2009/6/6/142
>>
>> and it's this bug entry:
>>
>>       Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=13474
>>       Subject         : Oops whilst booting
>>       Submitter       : Chris Clayton <chris2553@xxxxxxxxxxxxxx>
>>       Date            : 2009-06-06 18:59 (2 days old)
>>       References      : http://marc.info/?l=linux-kernel&m=124431487924254&w=4
>>
>> and now bisected down to
>>
>> >> commit d5a877e8dd409d8c702986d06485c374b705d340
>> >> Author: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
>> >> Date:   Sun May 24 13:03:43 2009 -0700
>> >>
>> >>     async: make sure independent async domains can't accidentally entangle
>>
>> please advise. Otherwise I'll have to revert.
>
> I think it's a bug in the async code.  It's providing cookies too high
> because it doesn't stop after it finds a running entry.
>
> Can we try this as the fix?
>
> James
>
> ---
>
> diff --git a/kernel/async.c b/kernel/async.c
> index 5054030..e4909ee 100644
> --- a/kernel/async.c
> +++ b/kernel/async.c
> @@ -97,7 +97,7 @@ static async_cookie_t  __lowest_in_progress(struct list_head *running)
>        if (!list_empty(running)) {
>                entry = list_first_entry(running,
>                        struct async_entry, list);
> -               ret = entry->cookie;
> +               return entry->cookie;
>        }
>
>        if (!list_empty(&async_pending)) {
>

I can also confirm that a kernel with this patch applied has withstood
the 100-boot torture. I'll try Linus's version now and report back
asap.
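
For anyone following the thread, here is a rough userspace sketch of the
behaviour James describes above. It is an illustration only, not the code in
kernel/async.c: the arrays, the integer `domain` tag (standing in for the
per-domain running list) and both function names are hypothetical, and the
scan of the pending list is an assumption, since the quoted hunk does not
show that part of the function. It also assumes that an entry already running
in a domain has a lower cookie than entries still pending for that domain.

```c
#include <stddef.h>

typedef unsigned long long cookie_t;    /* stand-in for async_cookie_t */

struct entry {
    cookie_t cookie;
    int domain;                         /* hypothetical tag for the owning domain */
};

/*
 * Pre-patch shape: the cookie of the first running entry is stored in ret,
 * but the function keeps going, so a pending entry from the same domain
 * (assumed to have a higher cookie) overwrites it.
 */
static cookie_t lowest_in_progress_buggy(const struct entry *running, size_t n_running,
                                         const struct entry *pending, size_t n_pending,
                                         int domain, cookie_t next_cookie)
{
    cookie_t ret = next_cookie;         /* "infinity": nothing in progress */
    size_t i;

    if (n_running > 0)
        ret = running[0].cookie;

    for (i = 0; i < n_pending; i++) {
        if (pending[i].domain == domain) {
            ret = pending[i].cookie;    /* clobbers the running cookie */
            break;
        }
    }
    return ret;
}

/* Patched shape: stop as soon as a running entry is found. */
static cookie_t lowest_in_progress_fixed(const struct entry *running, size_t n_running,
                                         const struct entry *pending, size_t n_pending,
                                         int domain, cookie_t next_cookie)
{
    size_t i;

    if (n_running > 0)
        return running[0].cookie;

    for (i = 0; i < n_pending; i++) {
        if (pending[i].domain == domain)
            return pending[i].cookie;
    }
    return next_cookie;
}
```

With one running entry (cookie 3) and a later pending entry in the same
domain (cookie 7), the first variant reports 7 and the second reports 3. In
this model, reporting 7 would tell a waiter that cookie 3 has already
finished while it is still running, which is how I read "providing cookies
too high".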

Chris

--
No, Sir; there is nothing which has yet been contrived by man, by
which so much happiness is produced as by a good tavern or inn -
Doctor Samuel Johnson