[PATCH 3/3] Memory management livelock
From: Mikulas Patocka
Date: Wed Sep 24 2008 - 14:53:53 EST
Fix violation of sync()/fsync() semantics. The previous code wrote at most
mapping->nrpages * 2 pages. Because new dirty pages can be created while
__filemap_fdatawrite_range is in progress, pages that were already dirty when
it was called could be left unwritten.
Example: an address space has two pages, with indices 4 and 5. Both are dirty.
Someone calls __filemap_fdatawrite_range, which sets .nr_to_write = 4.
Meanwhile, another process creates dirty pages 0, 1, 2, 3.
__filemap_fdatawrite_range writes pages 0, 1, 2, 3, finds that it has reached
the limit, and exits.
Result: pages 4 and 5, which were dirty before __filemap_fdatawrite_range was
invoked, were not written.
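
The scenario above can be reproduced with a small userspace toy model. This is
only a sketch, not kernel code: originals_written(), NPAGES and FIRST_ORIGINAL
are hypothetical names made up for illustration, and the model assumes that
dirty pages are written in ascending index order, as in the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NPAGES		6	/* page indices 0..5, as in the example */
#define FIRST_ORIGINAL	4	/* pages 4 and 5 were dirty before the call */

/*
 * Toy model, not kernel code: walk the dirty pages in index order and
 * "write" each one until the nr_to_write budget is used up.
 * Returns how many of the originally dirty pages (4 and 5) were written.
 */
static int originals_written(long nr_to_write)
{
	/* State seen by the scan: pages 0-3 were dirtied after the
	 * budget had already been computed from nrpages == 2. */
	bool dirty[NPAGES] = { true, true, true, true, true, true };
	int written = 0;

	assert(nr_to_write >= 0);
	for (int i = 0; i < NPAGES && nr_to_write > 0; i++) {
		if (!dirty[i])
			continue;
		dirty[i] = false;	/* page is now clean */
		nr_to_write--;
		if (i >= FIRST_ORIGINAL)
			written++;
	}
	return written;
}
```

In this model, originals_written(4) (the old budget, nrpages * 2 with
nrpages == 2) returns 0, while originals_written(LONG_MAX) returns 2.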
With starvation protection from the previous patch, this mapping->nrpages * 2
logic is no longer needed for data integrity writes; it is kept only for
WB_SYNC_NONE.
Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
---
mm/filemap.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Index: linux-2.6.27-rc7-devel/mm/filemap.c
===================================================================
--- linux-2.6.27-rc7-devel.orig/mm/filemap.c 2008-09-24 14:47:01.000000000 +0200
+++ linux-2.6.27-rc7-devel/mm/filemap.c 2008-09-24 15:01:23.000000000 +0200
@@ -202,6 +202,11 @@ static int sync_page_killable(void *word
* opposed to a regular memory cleansing writeback. The difference between
* these two operations is that if a dirty page/buffer is encountered, it must
* be waited upon, and not just skipped over.
+ *
+ * Because new dirty pages can be created while this is executing, the
+ * mapping->nrpages * 2 condition is unsafe. If we are doing a data integrity
+ * write, we must write all the pages. The AS_STARVATION bit will eventually
+ * prevent creating more dirty pages to avoid starvation.
*/
int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end, int sync_mode)
@@ -209,7 +214,7 @@ int __filemap_fdatawrite_range(struct ad
int ret;
struct writeback_control wbc = {
.sync_mode = sync_mode,
- .nr_to_write = mapping->nrpages * 2,
+ .nr_to_write = sync_mode == WB_SYNC_NONE ? mapping->nrpages * 2 : LONG_MAX,
.range_start = start,
.range_end = end,
};