Re: [patch] Fix file-corrupting bug, kernels 2.5.41 to 2.5.44

From: Jens Axboe (axboe@suse.de)
Date: Mon Oct 28 2002 - 06:04:27 EST


On Sun, Oct 27 2002, Andrew Morton wrote:
>
>
> This patch fixes a filesystem corrupting bug, present in 2.5.41 through
> 2.5.44. It can cause ext2 indirect blocks to not be written out. A
> fsck will fix it up.
>
> Under heavy memory pressure a PF_MEMALLOC task attemtps to write out a
> blockdev page whose buffers are already under writeback and which were
> dirtied while under writeback.

Nasty, it's odd how noone else seems to have noticed this?

> The writepage call returns -EAGAIN but because the caller is
> PF_MEMALLOC, the page was not being marked dirty again.
>
> The page sits on mapping->clean_pages for ever and it not written out.
>
> The fix is to mark that page dirty again for all callers, regardless of
> PF_MEMALLOC state.

I can confirm that this fixes my 'loosing data under vm pressure' bug,
both for O_DIRECT case and sgio. It passed 1 iteration of both tests, it
would not even get past the 10% mark before. Thanks!

BTW, 2.5.44-mm6 showed some funnies and corrupted data in other
interesting ways. I'm hesitant to report this as a bug right now, as it
may just have been that the target fs had not been fsck'ed after being
run under one of the buggy kernels. But it did crash in the end, dumping
lots of hot/cold warnings. The above verification was run under 2.5.44 +
sgio patches + your standalone __set_page_dirty() fix.

-- 
Jens Axboe

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Oct 31 2002 - 22:00:36 EST