Re: 2.6.21-rc5-mm3
From: Rafael J. Wysocki
Date: Sun Apr 01 2007 - 16:53:07 EST
On Sunday, 1 April 2007 22:39, Rafael J. Wysocki wrote:
> On Sunday, 1 April 2007 21:03, Andrew Morton wrote:
> > On Sun, 01 Apr 2007 18:00:12 +0200 Michal Piotrowski <michal.k.k.piotrowski@xxxxxxxxx> wrote:
> >
> > > Andrew Morton wrote:
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc5/2.6.21-rc5-mm3/
> > > >
> > >
> > > BUG: at /mnt/md0/devel/linux-mm/arch/i386/kernel/smp.c:571 native_smp_call_function_mask()
> > > [<c01051a1>] dump_trace+0x63/0x1eb
> > > [<c0105343>] show_trace_log_lvl+0x1a/0x30
> > > [<c0105f8a>] show_trace+0x12/0x14
> > > [<c0106027>] dump_stack+0x16/0x18
> > > [<c0113a92>] native_smp_call_function_mask+0x57/0x14b
> > > [<c0113c9b>] smp_call_function+0x1e/0x22
> > > [<c0129a60>] on_each_cpu+0x2a/0x73
> > > [<c013a12d>] clock_was_set+0x1b/0x1d
> > > [<c013b99d>] timekeeping_resume+0xb5/0xbb
> > > [<c027af35>] __sysdev_resume+0x17/0x5d
> > > [<c027b2aa>] sysdev_resume+0x19/0x4b
> > > [<c027fd12>] device_power_up+0xb/0x12
> > > [<c014f30b>] swsusp_suspend+0x55/0x63
> > > [<c014fad0>] pm_suspend_disk+0x163/0x28f
> > > [<c014e7be>] enter_state+0x54/0x1d5
> > > [<c014e9c5>] state_store+0x86/0x9c
> > > [<c01bfe47>] subsys_attr_store+0x23/0x2b
> > > [<c01bff89>] sysfs_write_file+0xc1/0xe9
> > > [<c0186485>] vfs_write+0xd1/0x15a
> > > [<c0186ab7>] sys_write+0x3d/0x72
> > > [<c010424c>] syscall_call+0x7/0xb
> > > [<b7f9b410>] 0xb7f9b410
> >
> > We're calling smp_call_function() with local interrupts disabled, which is
> > deadlockable.
> >
> > This, I expect, is because swsusp_suspend() optimistically tries to run
> > everything with local interrupts disabled.
>
> Well, not everything, only device_power_down()/device_power_up(), which
> handle just sysdevs.
>
> > I don't know why this has suddenly started happening -
> > timekeeping_resume()->clock_was_set()->on_each_cpu() has been there for a
> > while. Doesn't mainline do the same thing?
>
> Yes, and it has always done it. It is even documented in the "System Devices"
> section of Documentation/power/devices.txt. ;-)
>
> > Not sure what to do about this. The best fix would be to teach swsusp to
> > not be so optimistic: resume functions are called with local irqs _enabled_
> > - that's part of their call environment. swsusp tries to call them with
> > local irqs disabled and bad things happen.
>
> I think timekeeping_resume() shouldn't call smp_call_function() ...
... which is not even necessary, because sysdev_resume() runs on _one_ CPU.
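
To illustrate the point, here is a minimal userspace sketch. It is not kernel
code and not a patch: every model_* name below is made up for illustration,
the other CPUs are not modelled at all, and model_on_each_cpu() only mimics
the call order seen in the trace above (cross-CPU call first, then the local
call). It assumes, as discussed above, that the sysdev resume path runs with
local interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these are real kernel symbols. */
static bool model_irqs_off;    /* stands in for "local irqs disabled" */
static int  model_warnings;    /* counts simulated BUG/WARN hits */
static int  model_local_calls; /* times the callback ran on this CPU */

/* Hypothetical stand-in for the work clock_was_set() wants done per CPU. */
static void model_retrigger(void *unused)
{
	(void)unused;
	model_local_calls++;
}

/* Models the check that fired in the trace: a cross-CPU call made with
 * local irqs disabled is flagged, because it can deadlock. */
static void model_smp_call_function(void (*fn)(void *), void *arg)
{
	(void)fn;
	(void)arg;
	if (model_irqs_off) {
		model_warnings++;
		fprintf(stderr, "model: cross-CPU call with irqs disabled\n");
	}
	/* Other CPUs are deliberately not modelled. */
}

/* Simplified model of on_each_cpu(): other CPUs first, then this one. */
static void model_on_each_cpu(void (*fn)(void *), void *arg)
{
	model_smp_call_function(fn, arg);
	fn(arg);
}

/* Resume path as it behaves now: goes through the cross-CPU helper. */
static void model_resume_current(void)
{
	assert(model_irqs_off); /* sysdev resume runs with irqs off */
	model_on_each_cpu(model_retrigger, NULL);
}

/* Resume path as suggested: only one CPU is running, so do the work
 * locally and skip the cross-CPU call entirely. */
static void model_resume_local_only(void)
{
	assert(model_irqs_off);
	model_retrigger(NULL);
}
```

With model_irqs_off set, model_resume_current() trips the simulated warning,
while model_resume_local_only() does the same local work without it.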