Re: Status of buffered write path (deadlock fixes)

From: Nick Piggin
Date: Tue Dec 12 2006 - 23:04:39 EST


Trond Myklebust wrote:
On Wed, 2006-12-13 at 12:56 +1100, Nick Piggin wrote:

Note that these pages should be *really* rare. Definitely even for normal
filesystems I think RMW would use too much bandwidth if it were required
for any significant number of writes.


If file "foo" exists on the server, and contains data, then something
like

fd = open("foo", O_WRONLY);
write(fd, "1", 1);

should never need to trigger a read. That's a fairly common workload
when you think about it (happens all the time in apps that do random
write).

Right. What I'm currently looking at doing in that case is two copies,
first into a temporary buffer. Unfortunate, but we'll see what the
performance looks like.

I don't want to mandate anything just yet, so I'm just going through our
options. The first two options (remove, and RMW) are probably trickier
than they need to be, given the 3rd option available (temp buffer). Given
your input, I'm increasingly thinking that the best course of action would
be to fix this with the temp buffer and look at improving that later if it
causes a noticeable slowdown.
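
To make the temp buffer option concrete, here is a rough userspace sketch of
the two-copy idea. It is not kernel code and not a proposed patch. Every name
in it (sketch_page, sketch_lock_page, sketch_buffered_write, SKETCH_PAGE_SIZE)
is hypothetical, and a plain flag stands in for the real page lock. The point
is only the ordering: the copy that may fault on the source happens with the
page unlocked, and the copy done under the lock reads from memory that is
already resident.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for a pagecache page; not a kernel structure. */
struct sketch_page {
	int locked;				/* plays the role of the page lock */
	unsigned char data[SKETCH_PAGE_SIZE];
};

static void sketch_lock_page(struct sketch_page *page)
{
	assert(!page->locked);			/* no recursive locking in this sketch */
	page->locked = 1;
}

static void sketch_unlock_page(struct sketch_page *page)
{
	assert(page->locked);
	page->locked = 0;
}

/*
 * Copy up to len bytes from src into the page at offset, going through a
 * temporary (bounce) buffer. Returns the number of bytes written, which is
 * shortened if the range would run past the end of the page.
 */
static size_t sketch_buffered_write(struct sketch_page *page, size_t offset,
				    const void *src, size_t len)
{
	unsigned char bounce[SKETCH_PAGE_SIZE];

	assert(offset <= SKETCH_PAGE_SIZE);
	if (len > SKETCH_PAGE_SIZE - offset)
		len = SKETCH_PAGE_SIZE - offset;

	/*
	 * Copy 1: source -> bounce buffer, page NOT locked. In the kernel
	 * case this is where a fault on the user source could be taken
	 * without holding the page lock.
	 */
	memcpy(bounce, src, len);

	/*
	 * Copy 2: bounce buffer -> page, under the lock. The bounce buffer
	 * is already populated, so this copy does not touch the source.
	 */
	sketch_lock_page(page);
	memcpy(page->data + offset, bounce, len);
	sketch_unlock_page(page);

	return len;
}
```

With this ordering the destination page is never exposed half-written while
unlocked, and the one-byte write from the example above needs no read of the
rest of the page. The cost is the extra memcpy on every write.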


What is the generic problem you are trying to resolve? I saw something
fly by about a reader filling the !uptodate page while the writer is
updating it: how is that going to happen if the writer has the page
locked?

The problem is that you can't take a pagefault while holding the page
lock. You can deadlock against another page, the same page, or the
mmap_sem.

AFAIK the only thing that can modify the page if it is locked (aside
from the process that has locked it) is a process that has the page
mmapped(). However mmapped pages are always uptodate, right?

That's right (modulo the pagefault vs invalidate race bug).

But we need to unlock the destination page in order to be able to take
a pagefault to bring the source user memory uptodate. If the page is
not uptodate, then a read might see uninitialised data.

--
SUSE Labs, Novell Inc.