Re: Q: generic_file_write sets PG_locked???

Eric W. Biederman (ebiederm+eric@ccr.net)
24 Apr 1999 13:15:34 -0500


>>>>> "LT" == Linus Torvalds <torvalds@transmeta.com> writes:

LT> On 24 Apr 1999, Eric W. Biederman wrote:
>>
>> Basically I contend that using PageLocked as a "data coherency lock"
>> is broken, at least when using the generic code.

LT> It _isn't_ a data coherency lock at all:
O.k. It was my impresssion that was how NFS was using it though.

LT> it's really a data structure
LT> coherency thing. Think of it as serialization protecting the internal
LT> data structures, and protecting the filesystem layer from ever getting
LT> confused about the order of reads and writes.

LT> For example, a page cache entry always has to start out in a locked state,
LT> so that when a write occurs, we can fill in the page without any reader
LT> ever seeing the pre-write contents. If anybody was ever able to see those
LT> contents, we'd have serious security holes. That's why the write logic has
LT> to get the page lock early: not because of any real requirements for data
LT> coherency, but for a more subtle reason.

Bonk self on head.
Thanks. This was the was the answer looking for, when I started this thread.

When I look at if that I angle I still think we have real issues.
Basically PageUptodate tells if we can let anybody see the page.
In this context PageLocked keeps 2 processes from doing an initial
fill of the page. In particular we don't want someone calling
readpage and reading data off of the backing store before we can set
PageUptodate after copy_from_user.

So I agree that setting PageLock is necessary if you are skipping
calling readpage.

However the current code has a nice little race.
Process A puts in a write request into the lands in the middle of the page.
The page is initially uncached, so and since we didn't write
PAGE_SIZE bytes we don't set PageUptodate.
We delay the write.
Process B reads from the same page. So we call readpage (The page
isn't uptodate). And overwrite what was in the page.
Process C the kernel writer thread wakes up and writes the page
to the backing store. But we have lost what Process A put there :(

Since it is proveable that A came before B. (especially with a 30s
write delay), and A didn't read what B wrote we have one problem.

Since we never write what B wrote to the backing store,
we have another problem.

So I don't see how we can safely avoid a call to readpage if we are
writing data to a page, (and we aren't writing enough to set
PageUptodate).

In practical terms this only affects NFS currently as that is the only
fs that attempts uses generic_file_write and does writeback caching,
keeping the lock PageLocked across updatepage keeps smbfs safe.

And this still leaves it as an open question why keep the page locked
around the call to updatepage. The only real function I can see is
for a data coherency lock. . .

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/