Re: SMP race in munmap/ftruncate paths

From: Philip R. Auld (prauld@egenera.com)
Date: Fri Dec 21 2001 - 10:28:26 EST


Hi again,

"Philip R. Auld" wrote:
>
> Hi,
>
> It looks like there is an SMP locking problem and a race in
> the munmap and truncate paths. I've hit this problem using a tree
> based on the redhat 2.4.7-10 kernel, which in turn is 2.4.7-ac3 plus
> some other patches. It looks like 2.4.16 still has these issues although
> this code has changed a bit.

[stuff snipped]

        As a follow-up to this, I noticed that the lock hierarchy in this
area is broken as well. This problem _has_ been fixed in later kernels
(at least as of 2.4.16). I'm just offering this up for completeness, and in
case someone else is not in a position to switch code bases at will.

truncate_list_pages() is called with mapping->page_lock held, and since
taking the pagecache lock on top of it violates the lock ordering, it
only does a spin_trylock() on the pagecache lock:

mm/filemap.c:
375             pg_lock = PAGECACHE_LOCK(page);
376
377             if (!spin_trylock(pg_lock)) {
378                     return 1;
379             }
380

But since the caller (truncate_inode_pages) doesn't release
mapping->page_lock between calls, the trylock is useless: we can still
deadlock. The patch I sent yesterday to bandaid (I won't say fix, because
I don't think it solves the real problem...) the race I was seeing
exacerbates this. One solution is to move the mapping lock acquisition
into truncate_list_pages():

Index: mm/filemap.c
===================================================================
RCS file: /build/vault/linux2.4/mm/filemap.c,v
retrieving revision 1.5
diff -u -r1.5 filemap.c
--- mm/filemap.c 2001/10/19 03:48:00 1.5
+++ mm/filemap.c 2001/12/21 14:59:23
@@ -361,6 +366,7 @@
         struct list_head *curr;
         struct page * page;
 
+        spin_lock(&mapping->page_lock);
         curr = head->next;
         while (curr != head) {
                 unsigned long offset;
@@ -375,6 +381,7 @@
                         pg_lock = PAGECACHE_LOCK(page);
 
                         if (!spin_trylock(pg_lock)) {
+                               spin_unlock(&mapping->page_lock);
                                 return 1;
                         }
 
@@ -402,10 +409,10 @@
                 }
                 curr = curr->next;
         }
+        spin_unlock(&mapping->page_lock);
         return 0;
 out_restart:
         page_cache_release(page);
-        spin_lock(&mapping->page_lock);
         return 1;
 }
 
@@ -425,7 +432,6 @@
         unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
         int complete;
 
-        spin_lock(&mapping->page_lock);
         do {
                 complete = 1;
                 while (truncate_list_pages(mapping,&mapping->clean_pages, start, &partial))
@@ -436,7 +442,6 @@
                         complete = 0;
         } while (!complete);
         /* Traversed all three lists without dropping the lock */
-        spin_unlock(&mapping->page_lock);
 }
 
 /*

------------------------------------------------------
Philip R. Auld, Ph.D. Technical Staff
Egenera Corp. pauld@egenera.com
165 Forest St, Marlboro, MA 01752 (508)786-9444
