Re: [PATCH] Page writeback broken after resume: wb_timer lost

From: Mark Lord
Date: Mon Jun 19 2006 - 11:40:14 EST


Johannes Stezenbach wrote:
On Sat, May 20, 2006, Andrew Morton wrote:
From: Andrew Morton <akpm@xxxxxxxx>

pdflush is carefully designed to ensure that all wakeups have some
corresponding work to do - if a woken-up pdflush thread discovers that it
hasn't been given any work to do then this is considered an error.

That all broke when swsusp came along - because a timer-delivered wakeup to a
frozen pdflush thread will just get lost. This causes the pdflush thread to
get lost as well: the writeback timer is supposed to be re-armed by pdflush in
process context, but pdflush doesn't execute the callout which does this.

Fix that up by ignoring the return value from try_to_freeze(): jsut proceed,
see if we have any work pending and only go back to sleep if that is not the
case.


Signed-off-by: Andrew Morton <akpm@xxxxxxxx>


I've tested this patch for about a week now, by applying it to
the 2.6.17-rc3 kernel on my laptop, which I've been using
for more than a month now. This patch seems to cure the
mysterious symptoms reported in February:

http://lkml.org/lkml/2006/2/6/167
http://lkml.org/lkml/2006/2/6/170
http://lkml.org/lkml/2006/2/13/424
etc.

Actually I didn't remember to check "Dirty:" in /proc/meminfo,
but when I "sync"ed at the end of my workday, just prior to
swsupending it, sync returned immediately. with unpatched
2.6.17-rc3, sync would take half a minute. Maybe Mark can give
this patch a spin to check if it cures his problem, too.
(I still use vmware, so vmware was not the culprit.)

I just gave it a try here. With or without a suspend/resume cycle after boot,
the "sync" time is much quicker. But the Dirty count in /proc/meminfo
still shows very huge (eg. 600MB) values that never really get smaller
until I type "sync". But that subsequent "sync" only takes a couple
of seconds now, rather than 10-20 seconds like before.

Dunno what that all means -- I'm still keeping my little daemon around
to do periodic "sync" calls for safety.

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/