[GIT PULL] fscache: I/O API modernisation and netfs helper library

From: David Howells
Date: Tue Feb 09 2021 - 11:11:49 EST


Hi Linus,

Can you pull this during the upcoming merge window? It provides a more
modern I/O API for fscache and moves some common pieces out of network
filesystems into a common helper library. This request only includes
modifications for afs and ceph.

Dave Wysochanski has a patch series for nfs. Normal nfs works fine and
passes various tests, but it turned out pnfs has a problem that wasn't
discovered until quite late - pnfs does splitting of requests itself and
sending them to various places, but it will need to cooperate more closely
with the netfs lib over this.

I've given Dominique Martinet a patch for 9p and Steve French a partial
patch for cifs, but neither of those is going to be ready for this merge
window.

The main features of this request are:

(1) Institution of a helper library for network filesystems. The first
phase of this handles ->readpage(), ->readahead() and part of
->write_begin() on behalf of the netfs, requiring the netfs to provide
a common vector to perform a read to some part of a file.

This allows handling of the following to be (at least partially) moved
out of all the network filesystems and consolidated in one place:

- changes in VM vectors (Matthew Wilcox's work)
- transparent huge page support
- shaping of reads
- readahead expansion
- fs alignment/granularity (ceph, pnfs)
- cache alignment/granularity
- slicing of reads
- rsize
- keeping multiple read in flight } Steve French would like
- multichannel distribution } but for the future
- multiserver distribution (ceph, pnfs)
- stitching together reads from the cache and reads from the net
- copying data read from the server into the cache
- retry/reissue handling
- fallback after cache failure
- short reads
- fscrypt data crypting (Jeff Layton is considering for the future)

(2) Adding an alternate cache I/O API for use with the netfs lib that
makes use of kiocbs in the cache to do direct I/O between the cache
files and the netfs pages.

This is intended to replace the current I/O API that calls the backing
fs readpage op and than snooping the wait queues for completion to
read and using vfs_write() to write. It wasn't possible to do
in-kernel DIO when I first wrote cachefiles - but using kiocbs makes
it a lot simpler and more robust (and it uses a lot less memory).

(3) Add an ITER_XARRAY iov_iter that allows I/O iteration to be done on an
xarray of pinned pages (such as inode->i_mapping->i_pages), thereby
avoiding the need to allocate a bvec array to represent this.

This is used to present a set of netfs pages to the cache to do DIO on
and is also used by afs to present netfs pages to sendmsg. It could
also be used by unencrypted cifs to pass the pages to the TCP socket
it uses (if it's doing TCP) and my patch for 9p (which isn't included
here) can make use of it too.

(4) Make afs use the above. It passes the same xfstests (and has the same
failures) as the unpatched afs client.

(5) Make ceph use the above (I've merged a branch from Jeff Layton for this).
This also passes xfstests.

David
---
The following changes since commit 9791581c049c10929e97098374dd1716a81fefcc:

Merge tag 'for-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux (2021-01-20 14:15:33 -0800)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git tags/fscache-ioapi-20210203

for you to fetch changes up to 1df6bf2cc0fad1a5b2b32b7b0066b13175ad1ce4:

netfs: Fix kerneldoc on netfs_subreq_terminated() (2021-02-03 11:17:57 +0000)

----------------------------------------------------------------
fscache I/O API rework and netfs changes

----------------------------------------------------------------
David Howells (29):
iov_iter: Add ITER_XARRAY
vm: Add wait/unlock functions for PG_fscache
mm: Implement readahead_control pageset expansion
vfs: Export rw_verify_area() for use by cachefiles
netfs: Make a netfs helper module
netfs: Provide readahead and readpage netfs helpers
netfs: Add tracepoints
netfs: Gather stats
netfs: Add write_begin helper
netfs: Define an interface to talk to a cache
fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
afs: Disable use of the fscache I/O routines
afs: Pass page into dirty region helpers to provide THP size
afs: Print the operation debug_id when logging an unexpected data version
afs: Move key to afs_read struct
afs: Don't truncate iter during data fetch
afs: Log remote unmarshalling errors
afs: Set up the iov_iter before calling afs_extract_data()
afs: Use ITER_XARRAY for writing
afs: Wait on PG_fscache before modifying/releasing a page
afs: Extract writeback extension into its own function
afs: Prepare for use of THPs
afs: Use the fs operation ops to handle FetchData completion
afs: Use new fscache read helper API
Merge branch 'fscache-netfs-lib' into fscache-next
Merge branch 'ceph-netfs-lib' of https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux into fscache-next
netfs: Fix various bits of error handling
afs: Fix error handling in afs_req_issue_op()
netfs: Fix kerneldoc on netfs_subreq_terminated()

Jeff Layton (7):
ceph: disable old fscache readpage handling
ceph: rework PageFsCache handling
ceph: fix fscache invalidation
ceph: convert readpage to fscache read helper
ceph: plug write_begin into read helper
ceph: convert ceph_readpages to ceph_readahead
ceph: fix an oops in error handling in ceph_netfs_issue_op

fs/Kconfig | 1 +
fs/Makefile | 1 +
fs/afs/Kconfig | 1 +
fs/afs/dir.c | 225 +++++---
fs/afs/file.c | 470 ++++-------------
fs/afs/fs_operation.c | 4 +-
fs/afs/fsclient.c | 108 ++--
fs/afs/inode.c | 7 +-
fs/afs/internal.h | 58 +-
fs/afs/rxrpc.c | 150 ++----
fs/afs/write.c | 610 ++++++++++++----------
fs/afs/yfsclient.c | 82 +--
fs/cachefiles/Makefile | 1 +
fs/cachefiles/interface.c | 5 +-
fs/cachefiles/internal.h | 9 +
fs/cachefiles/rdwr2.c | 412 +++++++++++++++
fs/ceph/Kconfig | 1 +
fs/ceph/addr.c | 535 ++++++++-----------
fs/ceph/cache.c | 125 -----
fs/ceph/cache.h | 101 +---
fs/ceph/caps.c | 10 +-
fs/ceph/inode.c | 1 +
fs/ceph/super.h | 1 +
fs/fscache/Kconfig | 1 +
fs/fscache/Makefile | 3 +-
fs/fscache/internal.h | 3 +
fs/fscache/page.c | 2 +-
fs/fscache/page2.c | 117 +++++
fs/fscache/stats.c | 1 +
fs/internal.h | 5 -
fs/netfs/Kconfig | 23 +
fs/netfs/Makefile | 5 +
fs/netfs/internal.h | 97 ++++
fs/netfs/read_helper.c | 1161 +++++++++++++++++++++++++++++++++++++++++
fs/netfs/stats.c | 59 +++
fs/read_write.c | 1 +
include/linux/fs.h | 1 +
include/linux/fscache-cache.h | 4 +
include/linux/fscache.h | 40 +-
include/linux/netfs.h | 167 ++++++
include/linux/pagemap.h | 16 +
include/linux/uio.h | 11 +
include/net/af_rxrpc.h | 2 +-
include/trace/events/afs.h | 74 ++-
include/trace/events/netfs.h | 201 +++++++
lib/iov_iter.c | 313 ++++++++++-
mm/filemap.c | 18 +
mm/readahead.c | 70 +++
net/rxrpc/recvmsg.c | 9 +-
49 files changed, 3749 insertions(+), 1573 deletions(-)
create mode 100644 fs/cachefiles/rdwr2.c
create mode 100644 fs/fscache/page2.c
create mode 100644 fs/netfs/Kconfig
create mode 100644 fs/netfs/Makefile
create mode 100644 fs/netfs/internal.h
create mode 100644 fs/netfs/read_helper.c
create mode 100644 fs/netfs/stats.c
create mode 100644 include/linux/netfs.h
create mode 100644 include/trace/events/netfs.h