[PATCH 3/3] tmpfs: fix spurious ENOSPC when racing with unswap

From: Hugh Dickins
Date: Sun May 08 2011 - 15:45:06 EST


Testing the shmem_swaplist replacements for igrab() revealed another bug:
writes to /dev/loop0 on a tmpfs file which fills its filesystem were
sometimes failing with "Buffer I/O error"s.

These came from ENOSPC failures of shmem_getpage() when racing with
swapoff. The same could happen when racing with another shmem_getpage()
which pulls the page in from swap between our find_lock_page() and our
taking of info->lock (though not in the single-threaded loop case).

This is unacceptable, and it is surprising that I've not noticed it before:
the bug dates back many years, but (presumably) became a lot easier to
reproduce in 2.6.36, which placed a page preallocation in the race window.

Fix it by rechecking the page cache before settling on an ENOSPC error.

Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxx>
Cc: stable@xxxxxxxxxx
---
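
For readers who want the shape of the fix outside the kernel, here is a
minimal userspace sketch of "recheck the cache before settling on ENOSPC".
It is not kernel code: struct toy_fs, toy_getpage(), racer_swaps_in() and
the one-shot race_hook are all hypothetical names invented for illustration,
and a plain bool array stands in for the page cache. The hook simulates a
racing swapin landing between the cache lookup and the block accounting.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy stand-in for a size-limited tmpfs (hypothetical, not a kernel type). */
struct toy_fs {
	bool cached[NPAGES];	/* stand-in for the page cache */
	long used_blocks;	/* blocks currently accounted */
	long max_blocks;	/* filesystem size limit */
	/* One-shot hook run inside the race window (illustration only). */
	void (*race_hook)(struct toy_fs *fs, size_t idx);
};

/* Simulates another task (or swapoff) bringing the page in from swap.
 * Its block was accounted long ago, so used_blocks does not change. */
static void racer_swaps_in(struct toy_fs *fs, size_t idx)
{
	fs->cached[idx] = true;
}

/* Returns 0 if page idx is (or becomes) cached, else a negative errno.
 * With recheck == false this mimics the old behaviour. */
static int toy_getpage(struct toy_fs *fs, size_t idx, bool recheck)
{
	if (idx >= NPAGES)
		return -EINVAL;
repeat:
	if (fs->cached[idx])
		return 0;

	/* Race window: the lookup missed, accounting not yet attempted. */
	if (fs->race_hook) {
		void (*hook)(struct toy_fs *, size_t) = fs->race_hook;

		fs->race_hook = NULL;	/* fire once only */
		hook(fs, idx);
	}

	if (fs->used_blocks >= fs->max_blocks)
		goto nospace;
	fs->used_blocks++;
	if (fs->cached[idx]) {
		/* Lost the race after accounting: undo and retry. */
		fs->used_blocks--;
		goto repeat;
	}
	fs->cached[idx] = true;
	return 0;

nospace:
	/* The page may have appeared while we were not looking: if so,
	 * retry the lookup rather than report a spurious ENOSPC. */
	if (recheck && fs->cached[idx])
		goto repeat;
	return -ENOSPC;
}
```

On a full toy filesystem with the hook armed, toy_getpage(&fs, 3, false)
returns -ENOSPC even though page 3 is now cached, whereas passing
recheck = true retries and returns 0 without touching used_blocks. With no
racer, both variants still return -ENOSPC, so a genuine out-of-space
condition is reported as before.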

mm/shmem.c | 32 ++++++++++++++++++++++----------
1 file changed, 22 insertions(+), 10 deletions(-)

--- tmpfs2/mm/shmem.c 2011-05-07 17:39:00.656959448 -0700
+++ tmpfs3/mm/shmem.c 2011-05-07 17:40:00.665256101 -0700
@@ -1407,20 +1407,14 @@ repeat:
if (sbinfo->max_blocks) {
if (percpu_counter_compare(&sbinfo->used_blocks,
sbinfo->max_blocks) >= 0 ||
- shmem_acct_block(info->flags)) {
- spin_unlock(&info->lock);
- error = -ENOSPC;
- goto failed;
- }
+ shmem_acct_block(info->flags))
+ goto nospace;
percpu_counter_inc(&sbinfo->used_blocks);
spin_lock(&inode->i_lock);
inode->i_blocks += BLOCKS_PER_PAGE;
spin_unlock(&inode->i_lock);
- } else if (shmem_acct_block(info->flags)) {
- spin_unlock(&info->lock);
- error = -ENOSPC;
- goto failed;
- }
+ } else if (shmem_acct_block(info->flags))
+ goto nospace;

if (!filepage) {
int ret;
@@ -1500,6 +1494,24 @@ done:
error = 0;
goto out;

+nospace:
+ /*
+ * Perhaps the page was brought in from swap between find_lock_page
+ * and taking info->lock? We allow for that at add_to_page_cache_lru,
+ * but must also avoid reporting a spurious ENOSPC while working on a
+ * full tmpfs. (When filepage has been passed in to shmem_getpage, it
+ * is already in page cache, which prevents this race from occurring.)
+ */
+ if (!filepage) {
+ struct page *page = find_get_page(mapping, idx);
+ if (page) {
+ spin_unlock(&info->lock);
+ page_cache_release(page);
+ goto repeat;
+ }
+ }
+ spin_unlock(&info->lock);
+ error = -ENOSPC;
failed:
if (*pagep != filepage) {
unlock_page(filepage);