Re: [RFC][PATCH] netfs, afs, ceph: Use folios
From: Matthew Wilcox
Date: Wed Aug 11 2021 - 09:56:21 EST
On Wed, Aug 11, 2021 at 02:07:51PM +0100, David Howells wrote:
> Convert the netfs helper library and the afs filesystem to use folios.
> NOTE: This patch will also need to alter the ceph filesystem, but as that's
> not been done that yet, ceph will fail to build.
> The patch makes two alterations to the mm headers:
> (1) Fix a bug in readahead_folio() where a NULL return from
> __readahead_folio() will cause folio_put() to oops.
I'll fold that in.
> (2) Add folio_change_private() to change the private data on the folio
> without adjusting the page refcount or changing the flag. This
> assumes folio_attach_private() was already called.
> (*) Should I be using page_mapping() or page_file_mapping()?
Depends if you can have a swapfile on your filesystem. I'd like to
get rid of this and only use the directIO path for swap, but that's a
> (*) Can page_endio() be split into two separate functions, one for read
> and one for write? If seems a waste of time to conditionally switch
> between two different branches.
So you'd like a folio_end_write() and folio_end_read()?
> (*) Is there a better way to implement afs_kill_pages() and
> afs_redirty_pages()? I was previously using find_get_pages_contig()
> into a pagevec, but that doesn't look like it'll work with folios, so
> I'm now calling filemap_get_folio() a lot more - not that it matters
> so much, as these are failure paths.
I always disliked the _contig variants. Block filesystems tend to
follow the pattern
while network filesystems tend to use the pattern
it'd be nice to follow the same pattern for both. Would reduce the
amount of duplicated infrastructure.
> Also, should these be moved into generic code?
I'd have to figure out what they do to answer this question.
> (*) Can ->page_mkwrite() see which subpage of a folio got hit?
It already does -- you're passed a page, not a folio. Are you trying
to optimise by only marking part of a folio as dirty? If so, that's a
bad idea because we're going to want to, eg, map 64KB chunks of a folio
with a single TLB entry on ARM, so you'll only get one notification for
> (*) __filemap_get_folio() should be used instead of
> grab_cache_page_write_begin()? What should be done if xa_is_value()
> returns true on the value returned by that?
If you don't pass FGP_ENTRY, it won't return you an xa_is_value() ...