Re: sync-Regression in 2.6.28.2?

From: Nick Piggin
Date: Wed Feb 04 2009 - 01:18:07 EST


On Wednesday 04 February 2009 06:51:16 Federico Cuello wrote:
> Nick Piggin wrote:
> > On Wednesday 28 January 2009 08:09:10 Federico Cuello wrote:
> >> Ralf Hildebrandt escribiÃ:
> >>> I recently installed 2.6.28.2 on our postfix/dovecot-based
> >>> mailboxserver. Previously, 2.6.28 and 2.6.28.1 have been running there
> >>> without a hitch.
> >>>
> >>> Now with 2.6.28.2S I had two major lockups: All writes to the users'
> >>> Maildirs (on ext4) would stall, the load would rise, "sync" would never
> >>> return.
> >>>
> >>> I had to "reboot -f -n" to get the machine back. All hanging processes
> >>> were unkillable, even with kill -9.
> >>> [...]
> >>
> >> The same is happening to me, but I have some logs taken with sysrq.
[...]
> >> 0 2 99028 46536 26368 1569400 0 0 4 0 1407 387 0
> >> 0 0 100
> >>
> >> Notice the 100% iowait.
> >>
> >> I also managed to reproduce it doing a rsync from one partition to a USB
> >> drive. After the lockup I can't read any file from the source partition,
> >> but the other partitions can be accessed normally.
> >
> > Hm, thanks for reporting, can you guys get a sysrq+W trace when the
> > system reaches this state?
>
> Here it is, if you need something else please ask:

Thanks, could you reply-to-all when replying to retain ccs please?

Common theme is ext4, which uses no_nrwrite_index_update, and I introduced
a bug in there which could possibly cause ext4 to go into a loop...

Would it be possible if you can test the following patch?

Thanks,
Nick

---

Commit 05fe478dd04e02fa230c305ab9b5616669821dd3 introduced some
@wbc->nr_to_write breakage. Here is the change from the commit:

>--- a/mm/page-writeback.c
>+++ b/mm/page-writeback.c
>@@ -963,8 +963,10 @@ retry:
> }
> }
>
> if (--nr_to_write <= 0)
> done = 1;
> if (wbc->sync_mode == WB_SYNC_NONE) {
> if (--wbc->nr_to_write <= 0)
> done = 1;
> }
> if (wbc->nonblocking && bdi_write_congested(bdi)) {
> wbc->encountered_congestion = 1;
> done = 1
> }

It makes the following changes:
1. Decrement wbc->nr_to_write instead of nr_to_write
2. Decrement wbc->nr_to_write _only_ if wbc->sync_mode == WB_SYNC_NONE
3. If synced nr_to_write pages, stop only if if wbc->sync_mode ==
WB_SYNC_NONE, otherwise keep going.

However, according to the commit message, the intention was to
only make change 3. Change 1 is a bug. Change 2 does not seem to be
necessary, and it breaks UBIFS expectations, so if needed, it
should be done separately later. And change 2 does not seem to
be documented in the commit message.

This patch does the following:
1. Undo changes 1 and 2
2. Add a comment explaining change 3 (it very useful to have comments in
_code_, not only in the commit).

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@xxxxxxxxx>
Acked-by: Nick Piggin <npiggin@xxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
mm/page-writeback.c | 21 +++++++++++++++------
1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index b493db7..dc32dae 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1051,13 +1051,22 @@ continue_unlock:
}
}

- if (wbc->sync_mode == WB_SYNC_NONE) {
- wbc->nr_to_write--;
- if (wbc->nr_to_write <= 0) {
- done = 1;
- break;
- }
+ if (nr_to_write > 0)
+ nr_to_write--;
+ else if (wbc->sync_mode == WB_SYNC_NONE) {
+ /*
+ * We stop writing back only if we are not
+ * doing integrity sync. In case of integrity
+ * sync we have to keep going because someone
+ * may be concurrently dirtying pages, and we
+ * might have synced a lot of newly appeared
+ * dirty pages, but have not synced all of the
+ * old dirty pages.
+ */
+ done = 1;
+ break;
}
+
if (wbc->nonblocking && bdi_write_congested(bdi)) {
wbc->encountered_congestion = 1;
done = 1;

¢éì®&Þ~º&¶¬–+-±éÝ¥Šw®žË±Êâmébžìdz¹Þ)í…æèw*jg¬±¨¶‰šŽŠÝj/êäz¹ÞŠà2ŠÞ¨è­Ú&¢)ß«a¶Úþø®G«éh®æj:+v‰¨Šwè†Ù>Wš±êÞiÛaxPjØm¶Ÿÿà -»+ƒùdš_