On Mon, 23 Oct 2000, Linus Torvalds wrote:
>
>
> On Mon, 23 Oct 2000, Alexander Viro wrote:
> >
> > Oh, crap... Who introduced ->i_mmap_shared/->i_mmap separation and what
> > analysis had been done? Petr, can you reproduce the problem on -test7?
>
> I don't think that is it - that code looks very straightforward (and is
> needed on some silly architectures that cannot easily otherwise see if
> they need to be coherent wrt user space - mainly sparc and virtual
> caches).

OK, I see where the race can happen. Yes, vmtruncate() tries to kill the
mappings. Right. However, it does that _after_ truncate_inode_pages().
And there is a window when the only lock we are holding is ->i_sem.
sync_pte in that window ==> we are fucked.

So the question being: WTF do we postpone zapping the page tables until
after the truncate_inode_pages()? The following rules might make life
simpler, AFAICS:
    * as soon as ->i_size is set, no new pagetable references to
      off-limits pages can appear.
    * as soon as we are done with vmtruncate_list() there are no
      pagetable references.
    * truncate_inode_pages() never has to deal with pages referred to
      from pagetables.
    * ->i_size can't increase until we return from vmtruncate().
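
The ordering these rules imply can be sketched as a user-space toy model. This is not kernel code: every toy_* name, the fixed page count and the boolean arrays standing in for the page cache and the page tables are assumptions made up for illustration, and the real functions take different arguments. The model is single-threaded, so the last rule (->i_size not growing during the call) holds trivially.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8   /* hypothetical fixed file size limit, in pages */

/* Toy stand-in for an inode with one mapping; not the kernel structure. */
struct toy_inode {
    size_t i_size;                  /* file size, counted in pages */
    bool in_pagecache[TOY_NPAGES];  /* page is present in the page cache */
    bool in_pagetable[TOY_NPAGES];  /* page is referenced from a page table */
};

/* Rule 1: once i_size is set, no fault may map a page at or beyond it. */
static bool toy_fault(struct toy_inode *inode, size_t idx)
{
    if (idx >= TOY_NPAGES || idx >= inode->i_size)
        return false;
    inode->in_pagecache[idx] = true;
    inode->in_pagetable[idx] = true;
    return true;
}

/* Rule 2: after this returns, no page table references remain past new_size. */
static void toy_vmtruncate_list(struct toy_inode *inode, size_t new_size)
{
    for (size_t i = new_size; i < TOY_NPAGES; i++)
        inode->in_pagetable[i] = false;
}

/* Rule 3: the pages dropped here are never still mapped. */
static void toy_truncate_inode_pages(struct toy_inode *inode, size_t new_size)
{
    for (size_t i = new_size; i < TOY_NPAGES; i++) {
        assert(!inode->in_pagetable[i]);
        inode->in_pagecache[i] = false;
    }
}

/* Proposed order: set the size, zap the page tables, then drop the pages. */
static void toy_vmtruncate(struct toy_inode *inode, size_t new_size)
{
    inode->i_size = new_size;
    toy_vmtruncate_list(inode, new_size);
    toy_truncate_inode_pages(inode, new_size);
}
```

Swapping the last two calls in toy_vmtruncate() trips the assertion whenever a page beyond the new size is still mapped, which is the toy analogue of the window described above.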
It's not the only problem, but I would feel _much_ safer if pagefault
didn't rely on a pagecache miss. Actually... Hey. Why don't we do the
insertion into page tables _within_ ->nopage()? Look: let's take the tail
of do_no_page() into a helper function and just call it from the end of
every bloody ->nopage() out there. It _is_ easy: we have only 9 instances
in the tree, not counting filemap_nopage(). Moreover,
do_anonymous_page() will become symmetrical to the rest of the crowd.
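
A rough user-space sketch of that restructuring, under heavy assumptions: the toy_* types and functions below are hypothetical and do not match the kernel's structures or signatures. The only point is the shape of the change: the handler installs the page table entry itself through a shared helper, and the generic fault path no longer does the insertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8   /* hypothetical mapping length, in pages */

struct toy_page { int unused; };

/* Toy stand-in for a mapping; not the kernel's vm_area_struct. */
struct toy_vma {
    size_t size;                          /* backing file size, in pages */
    struct toy_page backing[TOY_NPAGES];  /* pretend page cache */
    struct toy_page *pte[TOY_NPAGES];     /* pretend page table */
    struct toy_page *(*nopage)(struct toy_vma *vma, size_t idx);
};

/*
 * The "tail of do_no_page()" pulled out as a helper: install the page
 * unless something is already mapped there, and return what is mapped.
 */
static struct toy_page *toy_install_pte(struct toy_vma *vma, size_t idx,
                                        struct toy_page *page)
{
    if (!vma->pte[idx])
        vma->pte[idx] = page;
    return vma->pte[idx];
}

/*
 * A ->nopage() instance that finishes by calling the helper, so the size
 * check and the page table insertion happen in the same place.
 */
static struct toy_page *toy_file_nopage(struct toy_vma *vma, size_t idx)
{
    if (idx >= TOY_NPAGES || idx >= vma->size)
        return NULL;
    return toy_install_pte(vma, idx, &vma->backing[idx]);
}

/* The generic fault path only dispatches and reports success or failure. */
static bool toy_do_no_page(struct toy_vma *vma, size_t idx)
{
    return vma->nopage(vma, idx) != NULL;
}
```

In this shape each handler decides under its own locking when the entry becomes visible, which is what would let the insertion be ordered against truncate.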
Comments?