Re: [PATCH] proc: pagemap: Hold mmap_sem during page walk

From: Linus Torvalds
Date: Wed Mar 31 2010 - 13:59:15 EST




On Wed, 31 Mar 2010, San Mehat wrote:
>
> If the mmap_sem is not held while we walk_page_range(), then
> it is possible for find_vma() to race with a remove_vma_list()
> caused by do_munmap() (or others).

I think you've found a bug, but I also look at that code and say "that's
just totally insane".

Why does it do that initial "get_user_pages()" at all? It never _uses_
that 'pages' array except to mark the pages dirty, but that's insane,
since as far as I can see the way it actually dirties the pages in
question is by doing a regular "put_user(pfn, pm->out);". And that will
dirty the pages in hardware (or in the put_user() path itself).

Also, I get the feeling that the _reason_ it is not doing that down_read()
is that it would dead-lock the whole system, exactly on that "put_user()",
if somebody else did a down_write() in another thread. In that case you
have:

        thread#1                        thread#2
        --------                        --------

        down_read()
        ...
                                        down_write() - blocks
        ...
        put_user();
         .. page fault ..
          down_read();  **DEADLOCK **


because our down_read() tries to be fair to the down_write().
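
The interleaving above can be replayed in a small user-space toy model.
This is only a sketch under stated assumptions: struct toy_rwsem and the
toy_* helpers are hypothetical names invented for illustration, they are
not the kernel's rwsem implementation, and "blocking" is modelled as a
false return value rather than an actual sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a writer-fair reader/writer semaphore.
 * Hypothetical illustration only: this is NOT the kernel's rwsem code. */
struct toy_rwsem {
    int  active_readers;   /* readers currently holding the lock */
    bool writer_active;    /* true while a writer holds the lock */
    int  waiting_writers;  /* writers queued behind current holders */
};

/* Try to take the lock for reading. Returns false where a real
 * down_read() would sleep: a writer holds the lock or is queued. */
static bool toy_down_read(struct toy_rwsem *sem)
{
    if (sem->writer_active || sem->waiting_writers > 0)
        return false;
    sem->active_readers++;
    return true;
}

/* Try to take the lock for writing. Returns false (and queues the
 * writer) where a real down_write() would sleep. */
static bool toy_down_write(struct toy_rwsem *sem)
{
    if (sem->writer_active || sem->active_readers > 0) {
        sem->waiting_writers++;
        return false;
    }
    sem->writer_active = true;
    return true;
}

/* Replays the interleaving from the diagram.
 * Returns true if it ends with both threads blocked on each other. */
bool toy_pagemap_deadlocks(void)
{
    struct toy_rwsem mmap_sem = { 0, false, 0 };

    /* thread#1: down_read() around the page walk */
    bool t1_outer = toy_down_read(&mmap_sem);
    assert(t1_outer);  /* an idle lock always grants a reader */

    /* thread#2: down_write(), blocks behind thread#1's read hold */
    bool t2_write = toy_down_write(&mmap_sem);

    /* thread#1: put_user() faults, fault handler does down_read() */
    bool t1_fault = toy_down_read(&mmap_sem);

    /* Writer waits for thread#1's outer hold; thread#1's nested
     * read waits for the queued writer. Neither can make progress. */
    return !t2_write && !t1_fault;
}
```

Under these assumptions toy_pagemap_deadlocks() returns true. If the
waiting_writers check is dropped from toy_down_read() (a reader-preferring
lock), the nested read is granted instead, which is why the fairness
matters here.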

So I think your patch would just create _different_ trouble.

I get the _feeling_ that the whole point of that 'pages' array was to not
do that put_user() at all, but write to the physical pages through that
array. But the code looks totally buggy.

I would seriously suggest that we consider removing the 'pagemap'
interface. The way that code looks, it's just broken.

Matt - give me a reason (which includes either a patch to fix this sh*t up
or telling me why I'm wrong, but _also_ includes a real independent reason
to keep that thing around regardless) to not remove it all.

The whole notion seems to be utterly misdesigned.

Linus