Re: page_mkwrite caller is racy?

From: Nick Piggin
Date: Mon Jan 29 2007 - 20:14:57 EST


Hugh Dickins wrote:
On Mon, 29 Jan 2007, Nick Piggin wrote:

Moving page_cache_release(old_page) to below the next statement
will fix that problem.


Yes. I'm reluctant to steal your credit, but also reluctant to go
back and forth too much over this: please insert your Signed-off-by
_before_ mine in the patch below (substituting your own comment if
you prefer) and send it Andrew.

Not a priority for 2.6.20 or -stable: aside from the unlikelihood,
we don't seem to have any page_mkwrite users yet, as you point out.

Agreed. Thanks for doing the patch.

But it is sad that this thing got merged without any callers to even
know how it is intended to work.


I'm rather to blame for that: I pushed Peter to rearranging his work
on top of what David had, since they were dabbling in related issues,
and we'd already solved a number of them in relation to page_mkwrite;
so then when dirty tracking was wanted in, page_mkwrite came with it.

Well its not a big problem -- I knew there were several people lined
up who wanted it. XFS is another one IIRC.

Must it be able to sleep?


Not as David was using it: that was something I felt strongly it
should be allowd to do. For example, in order to allocate backing
store for the mmap'ed page to be written (that need has been talked
about off and on for years).

Fine, and Mark and Anton confirm it (cc'ed, thanks guys).

This is another discussion, but do we want the page locked here? Or
are the filesystems happy to exclude truncate themselves?


After do_wp_page has tested page_mkwrite, it must release old_page after
acquiring page table lock, not before: at some stage that ordering got
reversed, leaving a (very unlikely) window in which old_page might be
truncated, freed, and reused in the same position.

Andrew please apply.

Signed-off-by: Nick Piggin <npiggin@xxxxxxx>

Signed-off-by: Hugh Dickins <hugh@xxxxxxxxxxx>
---

mm/memory.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- 2.6.20-rc6/mm/memory.c 2007-01-25 08:25:27.000000000 +0000
+++ linux/mm/memory.c 2007-01-29 15:35:56.000000000 +0000
@@ -1531,8 +1531,6 @@ static int do_wp_page(struct mm_struct *
if (vma->vm_ops->page_mkwrite(vma, old_page) < 0)
goto unwritable_page;
- page_cache_release(old_page);
-
/*
* Since we dropped the lock we need to revalidate
* the PTE as someone else may have changed it. If
@@ -1541,6 +1539,7 @@ static int do_wp_page(struct mm_struct *
*/
page_table = pte_offset_map_lock(mm, pmd, address,
&ptl);
+ page_cache_release(old_page);
if (!pte_same(*page_table, orig_pte))
goto unlock;
}



--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/