Re: [RFC][patch 0/2] mm: remove PageReserved
From: David Howells
Date: Wed Aug 10 2005 - 09:28:43 EST
Daniel Phillips <phillips@xxxxxxxx> wrote:
> > An extra page flag beyond PG_uptodate, PG_lock and PG_writeback is
> > required to make readpage through the cache non-synchronous.
Sorry, I meant to say "filesystem cache": FS-Cache/CacheFS.
> Interesting, have you got a pointer to a full explanation? Is this about aio?
No, it's nothing to do with AIO. This is to do with using local disk to cache
network filesystems and other relatively slow devices.
What happens is this:
(1) readpage() is issued against NFS (for example).
(2) NFS consults the local cache, and finds the page isn't available there.
(3) NFS reads the page from the server.
(4) NFS sets PG_fs_misc and tells the cache to store the page.
(5) NFS sets PG_uptodate and unlocks the page.
Some time later, the cache finishes writing the page to disk:
(6) The cache calls NFS to say that it's finished writing the page.
(7) NFS calls end_page_fs_misc() - which clears PG_fs_misc - to indicate to
any waiters that the page can now be written to.
Now: any PTEs set up to point to this page start life read-only. If they're
part of a shared-writable mapping, then the MMU will generate a WP fault when
someone attempts to write to the page through that mapping:
(a) do_wp_page() gets called.
(b) do_wp_page() sees that the page's host has registered an interest in
knowing that the page is becoming writable:
vm_operations_struct::page_mkwrite()
(c) do_wp_page() calls out to the filesystem.
(d) NFS sees the page is wanting to become writable and waits for the
PG_fs_misc flag to become cleared.
(e) NFS returns to the caller and things proceed as normal.
Doing this permits the cache state to be more predictable in the event of
power loss because we know that userspace won't have scribbled on this page
whilst the cache was trying to write it to disk.
David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/