Re: Linux 2.6.22-rc4

From: Rafael J. Wysocki
Date: Tue Jun 05 2007 - 17:55:47 EST


On Tuesday, 5 June 2007 22:19, Pavel Machek wrote:
> Hi!
>
> > > > [ 116.733327] PM: suspend-to-disk mode set to 'shutdown' [
> > > > 116.738849] swsusp: Basic memory bitmaps created [ 116.745353]
> > > > Stopping tasks ... WARNING: at
> > > > /home/devel/linux-git/kernel/lockdep.c:2414 check_flags()
> >
> > > > [ 116.755052] irq event stamp: 69
> > > > [ 116.755060] hardirqs last enabled at (69): [<c04040f9>] syscall_exit_work+0x11/0x26
> > > > [ 116.755084] hardirqs last disabled at (68): [<c0403fdd>] syscall_exit+0x9/0x1a
> > > > [ 116.755109] softirqs last enabled at (0): [<c042150c>] copy_process+0x4dd/0x1286
> > > > [ 116.755139] softirqs last disabled at (0): [<00000000>] 0x0
> > > > [ 116.945776] done.
> >
> > > Well, it's harmless in the sense that "yeah, the system still works",
> > > but it does seem to be a real bug. We have hardware interrupts
> > > disabled when we _think_ we should have them on, so our irq tracking
> > > is off.
> > >
> > > Ingo, do you see what's up? It looks like we got a signal to a process
> > > that just got created, is the setup stuff for "tsk->hardirqs_enabled"
> > > perhaps off a bit?
> >
> > hm. I cannot see the source of the bug at the moment, but here's my
> > analysis so far:
> >
> > the last event that irqtrace got was #69, and that was a 'hardirqs on'
> > in syscall_exit_work. After that we did a 'hardirqs off' without
> > properly tracking that via irqtrace. Next time we got an irqtrace event
> > (event 70) the assert caught up with us and turned off lockdep and
> > backed out of that function. This was in:
> >
> > > [ 116.754957] [<c043c3e5>] check_flags+0x95/0x143
> > > [ 116.754967] [<c043f158>] lock_acquire+0x29/0x82
> > > [ 116.754977] [<c06313a7>] _spin_lock+0x35/0x42
> > > [ 116.754990] [<c044894a>] refrigerator+0x14/0xc6
> > > [ 116.755002] [<c042d4b3>] get_signal_to_deliver+0x33/0x397
> > > [ 116.755016] [<c0403597>] do_notify_resume+0x94/0x6ed
> > > [ 116.755029] [<c0404099>] work_notifysig+0x13/0x1a
> >
> > isnt the refrigerator() suspend related? Perhaps suspend disables irqs
> > somewhere that we forgot to track?
>
> refrigerator is suspend related, but I do not think it does any
> interrupt magic. We do magic later in hibernation process.
>
> This is in kernel/power/process.c, we have spinlock_irqsave there, but
> that's pretty much it AFAICT.

That's correct. We don't manipulate IRQs directly in the freezer.

Greetings,
Rafael


--
"Premature optimization is the root of all evil." - Donald Knuth
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/