Re: 2.6.21-rc7-mm1 + sysfs-oops-workaround.patch -- softwaresuspend failed (1 tasks refusing to freeze)

From: Andrew Morton
Date: Wed Apr 25 2007 - 03:09:09 EST


On Tue, 24 Apr 2007 23:41:32 -0700 "Miles Lane" <miles.lane@xxxxxxxxx> wrote:

> On 4/24/07, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, 24 Apr 2007 22:49:48 -0700 "Miles Lane" <miles.lane@xxxxxxxxx> wrote:
> >
> > > On 4/24/07, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > On Tue, 24 Apr 2007 22:27:44 -0700 "Miles Lane" <miles.lane@xxxxxxxxx> wrote:
> > > >
> > > > > [ 1251.506964] PM: Preparing system for mem sleep
> > > > > [ 1251.514790] Stopping tasks ...
> > > > > > [ 1271.456065] Stopping user space processes timed out after 20 seconds (1 tasks refusing to freeze):
> > > > > [ 1271.456243] multiload-apple
> > > > > [ 1271.456291] Restarting tasks ... done.
> > > > >
> > > > > This isn't happening under earlier builds I've tested. How can I debug this?
> > > > >
> > > >
> > > > hm, that's multiload-applet, some gnome thing.
> > > >
> > > > sysrq-T, perhaps? Perhaps the process is sleeping in the kernel somewhere.
> > >
> > > Should I wait for the next patch from Tejun before retesting? Perhaps
> > > this suspend problem is a side effect of the locking problem he
> > > mentioned.
> >
> > It's unlikely to be related to Tejun's sysfs changes.
> >
> I tried to reproduce this, but this time when I tried to suspend, the
> machine just hung, showing a message that said the system was
> suspending. When I touched my Synaptics mousepad after about ten
> seconds of waiting, the system suddenly suspended.

Fun. Could be one of Greg's trees, could be the swsusp patches, could be
the freezer changes, could be USB, could be the input layer, could be
anything. I don't see how we can merge anything at all into 2.6.22 at
present. The only thing to be said for doing that is that we'd increase
our pool of bisection-searchers. argh.

Rafael, Oleg: do we have a way of exercising the freezer from userspace?
Just do a freeze/unfreeze? We should.
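
For reference, a minimal sketch of what such a userspace freeze/unfreeze
test could look like. It assumes a kernel built with CONFIG_PM_DEBUG that
exposes /sys/power/pm_test, which is a facility of later kernels and not
something this -mm tree provides. The function name and its parameters are
made up for illustration, and it needs root on a real system.

```python
import os


def exercise_freezer(power_dir="/sys/power", sleep_state="mem"):
    """Freeze and thaw tasks without actually suspending.

    Assumes power_dir contains the pm_test and state control files.
    """
    pm_test = os.path.join(power_dir, "pm_test")
    state = os.path.join(power_dir, "state")

    # Ask the PM core to stop the suspend sequence after the freezer step.
    with open(pm_test, "w") as f:
        f.write("freezer")
    try:
        # With pm_test set, this write returns once tasks have been
        # frozen and thawed again; the machine does not go to sleep.
        with open(state, "w") as f:
            f.write(sleep_state)
    finally:
        # Restore normal behaviour so the next suspend is a real one.
        with open(pm_test, "w") as f:
            f.write("none")
```

If the freeze step fails, the kernel should log the refusing tasks the same
way as in the report above, so this could be run in a loop to try to
reproduce the multiload-applet failure.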

Also, can we get better diagnostics when the freeze fails? Say, do the
equivalent of a sysrq-T? It could be that multiload-applet is waiting on
activity from an already-frozen thread or something.
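
Until the kernel does that by itself, the same dump can be requested by
hand right after a failed freeze. A rough sketch, assuming
CONFIG_MAGIC_SYSRQ is enabled, the script runs as root, and dmesg is in
the path. The helper names are hypothetical, and the number of lines to
show after the match is an arbitrary guess because the dump layout differs
between kernel versions.

```python
import subprocess


def dump_task_states(trigger="/proc/sysrq-trigger"):
    # Writing 't' is the userspace equivalent of pressing sysrq-T:
    # the kernel logs the state and stack of every task.
    with open(trigger, "w") as f:
        f.write("t")


def lines_after(log_text, needle, count=20):
    """Return the last line containing needle plus the count lines after it."""
    lines = log_text.splitlines()
    # Search backwards so the fresh dump wins over older mentions,
    # such as the earlier "refusing to freeze" report.
    for i in range(len(lines) - 1, -1, -1):
        if needle in lines[i]:
            return lines[i:i + count + 1]
    return []


def report(comm="multiload-apple"):
    # comm is the truncated task name exactly as the kernel printed it.
    dump_task_states()
    log = subprocess.run(["dmesg"], capture_output=True, text=True,
                         check=True).stdout
    return "\n".join(lines_after(log, comm))
```

Calling report() just after the "refusing to freeze" message should show
where the task is sleeping, provided the log buffer is big enough to hold
the whole dump.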

> Upon resuming, I
> checked dmesg and found the time seems to be totally out of whack:

So sched_clock() went bad. That's another tree or three we can't merge.

Is the system time also wrong?

> [ 1334.589074] pci 0000:00:1f.6: LATE suspend
> [ 1334.589080] Intel ICH 0000:00:1f.5: LATE suspend
> [ 1334.589085] pci 0000:00:1f.3: LATE suspend
> [ 1334.589091] pci 0000:00:1f.0: LATE suspend
> [ 1334.589096] pci 0000:00:1e.0: LATE suspend, may wakeup
> [ 1334.589102] pci 0000:00:02.1: LATE suspend
> [ 1334.589108] pci 0000:00:02.0: LATE suspend
> [ 1334.589113] pci 0000:00:00.3: LATE suspend
> [ 1334.589118] pci 0000:00:00.1: LATE suspend
> [ 1334.589124] agpgart-intel 0000:00:00.0: LATE suspend
> [ 1334.589733] hwsleep-0323 [03] enter_sleep_state : Entering sleep state [S3]
> [18014527.889728] Intel machine check architecture supported.
> [18014527.889749] Intel machine check reporting enabled on CPU#0.
> [18014527.889783] Back to C!
> [18014527.889783] agpgart-intel 0000:00:00.0: EARLY resume
> [18014527.889783] PCI: Calling quirk c01c94b2 for 0000:00:00.0
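
The jump in the printk timestamps above is easy to miss in a long log. A
small sketch that scans saved dmesg output for it, assuming the usual
"[seconds.microseconds]" prefix. The function name and the one-hour
threshold are arbitrary choices for illustration, not anything the kernel
defines.

```python
import re

# Matches printk timestamps such as "[ 1334.589733]" or "[18014527.889728]".
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_clock_jumps(log_text, max_gap=3600.0):
    """Return (previous, current) timestamp pairs that look discontinuous.

    A pair is reported when time goes backwards or advances by more
    than max_gap seconds between two consecutive stamped lines.
    """
    jumps = []
    prev = None
    for line in log_text.splitlines():
        m = STAMP.search(line)
        if not m:
            continue  # wrapped or unstamped line
        now = float(m.group(1))
        if prev is not None and (now < prev or now - prev > max_gap):
            jumps.append((prev, now))
        prev = now
    return jumps
```

Fed the log quoted above, this should report a single discontinuity,
between the "Entering sleep state [S3]" line and the first line printed
after resume.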