ok, let me try again.
                                page->count

free                            0
in pagecache, clean             1
write                           2    <----- A
write done, still in pagecache  1    <----- B
write                           2
write done                      1
ok, so we've done two writes to this page and it's still in the pagecache.
however, the fact that it is dirty is not recorded anywhere in struct page
itself. it's implicitly stored in the state of page->buffers. now if you
have a filesystem like nfs which doesn't want to use page->buffers, then
you have no way of recording this fact.
you could, at point A, bump up refcount by one more, making it 2 after
the write is done. this will make it safe wrt shrink_mmap, but what about
future writes? should they bump up refcount by 2 again? there's no getting
away from the fact that dirtiness is a state and can't be represented by
a count.
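
to make the state-vs-count point concrete, here's a toy model in plain C.
it is not the kernel's struct page; toy_page and every function name below
are made up for illustration. it only shows that an extra reference per
write keeps growing, while a flag stays the same however many writes hit
the page.

```c
#include <assert.h>
#include <stdbool.h>

/* toy model only: hypothetical names, not the kernel's struct page */
struct toy_page {
    int  count;   /* stands in for page->count */
    bool dirty;   /* explicit dirty state, which struct page lacks here */
};

/* scheme 1: try to encode "dirty" as one extra reference per write */
static void write_done_counting(struct toy_page *p)
{
    p->count++;   /* grows with every write; nobody knows how many to drop */
}

/* scheme 2: record dirtiness as a state; repeated writes are idempotent */
static void write_done_flag(struct toy_page *p)
{
    p->dirty = true;
}

/* what a shrink_mmap-like scanner could test under scheme 1 */
static bool reclaimable_counting(const struct toy_page *p)
{
    return p->count == 1;   /* only the pagecache holds a reference */
}

/* what it could test under scheme 2 */
static bool reclaimable_flag(const struct toy_page *p)
{
    return p->count == 1 && !p->dirty;
}
```

with scheme 1 the count after n writes depends on n, so whoever cleans the
page can't tell how many references to drop. with scheme 2 the count stays
at 1 and the scanner looks at the flag instead.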
what nfs does is implement its own writeback mechanism. on the first write to
a page, it is added to an internal writeback queue and refcount is bumped up
so that it is 2 after the writepage returns. this insulates the page from
shrink_mmap. future writes to the same page can <handwave> probably
find the page in the internal queue itself, so page->count is not further
incremented. </handwave>
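
a rough sketch of that scheme, again as a self-contained toy and not the
actual code in fs/nfs/write.c. wb_page, toy_wb_queue and the toy_wb_*
functions are all hypothetical names, and the fixed-size array is just the
simplest thing that shows the idea: the queue takes one reference the first
time a page is written, and later writes find the page already queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 16

/* toy stand-in for a page; only the refcount matters here */
struct wb_page {
    int count;
};

/* hypothetical fs-private writeback queue */
struct toy_wb_queue {
    struct wb_page *pages[TOY_QUEUE_MAX];
    size_t n;
};

static bool toy_wb_contains(const struct toy_wb_queue *q,
                            const struct wb_page *p)
{
    for (size_t i = 0; i < q->n; i++)
        if (q->pages[i] == p)
            return true;
    return false;
}

/* called on every write; returns false only if the queue is full */
static bool toy_wb_note_write(struct toy_wb_queue *q, struct wb_page *p)
{
    if (toy_wb_contains(q, p))
        return true;           /* already queued: no extra reference */
    if (q->n == TOY_QUEUE_MAX)
        return false;
    q->pages[q->n++] = p;
    p->count++;                /* the queue's reference pins the page */
    return true;
}

/* pretend the data went out; drop the queue's references */
static void toy_wb_flush(struct toy_wb_queue *q)
{
    for (size_t i = 0; i < q->n; i++)
        q->pages[i]->count--;
    q->n = 0;
}
```

the point is that membership in the queue is the dirty state, and the
single extra reference is only there to keep the scanner away from the page.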
so we can do it, but nfs and nfs-like filesystems can't use the generic_*
mechanisms. it's ok if there's just one such fs, but if we have three or
four, then it begins to make sense to shift such functionality to the generic
code. it would have little or no impact on other filesystems and would
probably clean up the whole vmscan stuff anyway.
here's a quote from fs/nfs/write.c:

 * FIXME: Interaction with the vmscan routines is not optimal yet.
 * Either vmscan must be made nfs-savvy, or we need a different page
 * reclaim concept that supports something like FS-independent
 * buffer_heads with a b_ops-> field.
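
one way to read that FIXME, as a guess on my part and not an existing
kernel interface: the generic scanner calls through a small per-filesystem
ops table instead of peeking at page->buffers. everything below
(fs_page, fs_page_ops, scan_try_reclaim, the netfs_* functions) is invented
for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fs_page;

/* hypothetical per-filesystem callbacks, in the spirit of a b_ops field */
struct fs_page_ops {
    bool (*is_busy)(const struct fs_page *p);  /* dirty or under i/o? */
    int  (*writeback)(struct fs_page *p);      /* 0 on success */
};

/* toy page: refcount plus an opaque fs-owned pointer */
struct fs_page {
    int count;
    const struct fs_page_ops *ops;   /* NULL: fs keeps no private state */
    void *fs_private;
};

/* generic scanner step: knows nothing about any particular fs */
static bool scan_try_reclaim(struct fs_page *p)
{
    if (p->count != 1)
        return false;                /* someone else still holds it */
    if (p->ops && p->ops->is_busy(p)) {
        p->ops->writeback(p);        /* kick writeback, skip this pass */
        return false;
    }
    p->count = 0;                    /* page is now free */
    return true;
}

/* an nfs-like filesystem that tracks dirtiness on its own */
struct netfs_state {
    bool dirty;
};

static bool netfs_is_busy(const struct fs_page *p)
{
    return ((const struct netfs_state *)p->fs_private)->dirty;
}

static int netfs_writeback(struct fs_page *p)
{
    /* toy: pretend the write completes immediately */
    ((struct netfs_state *)p->fs_private)->dirty = false;
    return 0;
}

static const struct fs_page_ops netfs_ops = {
    .is_busy   = netfs_is_busy,
    .writeback = netfs_writeback,
};
```

in this toy a dirty page survives the first scan (which only starts
writeback) and is reclaimed on the next one, without the fs having to hold
an extra reference just to hide the page from the scanner.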
ganesh