Re: [discuss] 2.6.19-rc5: known regressions (v2)

From: Paolo Ornati
Date: Sat Nov 11 2006 - 07:34:15 EST


On Sat, 11 Nov 2006 11:49:26 +0100
"Rafael J. Wysocki" <rjw@xxxxxxx> wrote:

> > > > Subject : BUG: scheduling while atomic: events/0/0x00000001/4
> > > > after resume
> > > > References : http://lkml.org/lkml/2006/11/2/209
> > > > Submitter : Paolo Ornati <ornati@xxxxxxxxxxxxx>
> > > > Status : unknown
> > >
> > > I couldn't find anything in the report that would indicate the problem occured
> > > after a resume. Was it really the case?
> >
> > Ahh, I've written that in another email but I trimmed LKML from CC by
> > mistake ;)
> >
> >
> > Relevant portion of that mail follows... anyway it seems that "-rc5" is
> > _OK_ since I'm running it by 2 days and it survived 9 suspend/resume
> > cycles.
>
> Okay, please let us know if it survives the next several cycles.
>
> OTOH, the problem may be hiding.

Ok, and if it survives againg and again I can do a partial bisection...

so that someone could guess the change that hides/fixes this and I can
revert it on top of "-rc5" to confirm.


>
> > ------------------------------------------------------------------
> >
> > I've reproduced it (with rc4-g4b1c46a3), and I think it is
> > suspend/resume related sice the messages start flooding dmesg just
> > after a resume...
> >
> > I'll see if it is reproducible just doing suspend/resume a couple of
> > times... and if so I'll try with -rc5.
> >
> >
> > dmesg (stripped at the end):
> >
> > [ 0.000000] Linux version 2.6.19-rc4-g4b1c46a3 (paolo@tux) (gcc version 4.1.1 (Gentoo 4.1.1)) #17 PREEMPT Wed Nov 1 18:36:28 CET 2006

[CUT]

> > [ 25.382084] BUG: scheduling while atomic: events/0/0x00000001/4
> > [ 25.382086]
> > [ 25.382087] Call Trace:
> > [ 25.382097] [<ffffffff8049fafb>] __sched_text_start+0x5b/0x4cc
> > [ 25.382102] [<ffffffff802f34b6>] list_add+0xc/0xe
> > [ 25.382107] [<ffffffff80236519>] worker_thread+0x0/0x11b
> > [ 25.382110] [<ffffffff802365ce>] worker_thread+0xb5/0x11b
> > [ 25.382115] [<ffffffff802233e2>] default_wake_function+0x0/0xf
> > [ 25.382119] [<ffffffff80236519>] worker_thread+0x0/0x11b
> > [ 25.382124] [<ffffffff80239269>] kthread+0xce/0x101
> > [ 25.382128] [<ffffffff802234b1>] schedule_tail+0x30/0xa2
> > [ 25.382132] [<ffffffff8020a238>] child_rip+0xa/0x12
> > [ 25.382137] [<ffffffff8023919b>] kthread+0x0/0x101
> > [ 25.382140] [<ffffffff8020a22e>] child_rip+0x0/0x12
>
> Apparently, the kernel thinks that worker_thread() is running in the atomic
> context, so there may be a problem with preempt_count(), for example.
>
> Is preemption enabled in your kernel(s)?

YES (see first line of dmesg) - full config attached

--
Paolo Ornati
Linux 2.6.19-rc5 on x86_64

Attachment: CONFIG.gz
Description: GNU Zip compressed data