Re: [PATCH 14/33] fscache, cachefiles: Add alternate API to use kiocb for read/write to cache

From: David Howells
Date: Tue Feb 16 2021 - 10:11:17 EST


Christoph Hellwig <hch@xxxxxx> wrote:

> > Filesystems wanting to use the new API must #define FSCACHE_USE_NEW_IO_API
> > before #including the header
>
> What exactly does this ifdef buys us? It seems like the old and new
> APIs don't even conflict.

I was asked to add this. The APIs look like they don't conflict, but you
can't mix them for a given file because of the differing behaviour of the
PG_fscache flag. It also makes it much easier to make sure you don't miss
something. That has happened and it led to some strange effects before we
worked out what was going on.

> And if we really need an ifdef I'd rather use that for the old code to make
> grepping for that easier.

I can do it that way - but this doesn't require changing filesystems that
aren't being changed. The intent would be to eliminate the #ifdef in a cycle
or two anyway.

Besides, there are 5 filesystems that use this, and two of them are converted
here. grep would only return three hits: one each in fs/9p/cache.h,
fs/cifs/fscache.h and fs/nfs/fscache.h.

OTOH, I suppose it might dissuade people from adding new usages of the old
API.

> > + if (ki->term_func) {
> > + if (ret < 0)
> > + ki->term_func(ki->term_func_priv, ret);
> > + else
> > + ki->term_func(ki->term_func_priv, ki->skipped + ret);
>
> Why not simplify:
>
> if (ret > 0)
> ret += ki->skipped;
> ki->term_func(ki->term_func_priv, ret);

Could do that I suppose. The optimiser will make them the same anyway.

I still wonder if I should do something with ret2 as obtained from the kiocb
completion function:

static void cachefiles_read_complete(struct kiocb *iocb, long ret, long ret2)

Can we consolidate to one return value?

> > + /* If the caller asked us to seek for data before doing the read, then
> > + * we should do that now. If we find a gap, we fill it with zeros.
> > + */
>
> FYI, this is not the normal kernel comment style..

I've been following the network style.

> > + ret = rw_verify_area(READ, file, &ki->iocb.ki_pos, len - skipped);
> > + if (ret < 0)
> > + goto presubmission_error_free;
> > +
> > + get_file(ki->iocb.ki_filp);
> > +
> > + old_nofs = memalloc_nofs_save();
> > + ret = call_read_iter(file, &ki->iocb, iter);
> > + memalloc_nofs_restore(old_nofs);
>
> As mentioned before I think all this magic belongs in to a helper
> in the VFS.

You suggested vfs_iocb_iter_read() in your reply to another patch, but it
occurs to me that that doesn't have memalloc_nofs_*() in it. I could hoist
the memalloc_nofs stuff out and use those helpers.

> > +static const struct netfs_cache_ops cachefiles_netfs_cache_ops = {
> > + .end_operation = cachefiles_end_operation,
> > + .read = cachefiles_read,
> > + .write = cachefiles_write,
> > + .expand_readahead = NULL,
> > + .prepare_read = cachefiles_prepare_read,
> > +};
> ...
> Also at least in linux-next ->read and ->write seem to never actually
> be called.

See netfs_read_from_cache() and netfs_rreq_do_write_to_cache() in
fs/netfs/read_helper.c. Look for "cres->ops->".

> > +{
> > + struct cachefiles_object *object;
> > + struct cachefiles_cache *cache;
> > + struct path path;
> > + struct file *file;
> > +
> > + _enter("");
> > +
> > + object = container_of(op->op.object,
> > + struct cachefiles_object, fscache);
> > + cache = container_of(object->fscache.cache,
> > + struct cachefiles_cache, cache);
> > +
> > + path.mnt = cache->mnt;
> > + path.dentry = object->backer;
> > + file = open_with_fake_path(&path, O_RDWR | O_LARGEFILE | O_DIRECT,
> > + d_inode(object->backer), cache->cache_cred);
>
> I think this should be plain old dentry_open?

open_with_fake_path() sets FMODE_NOACCOUNT. In the fscache-iter branch, the
file is held open a lot longer and then ENFILE/EMFILE starts being a serious
problem.

That said, I'm considering changing things so that all the data in the cache
is held in one or a few files with an index to locate things - at which point
this issue goes away.

> > + op = fscache_alloc_retrieval(cookie, NULL, NULL, NULL);
> > + if (!op)
> > + return -ENOMEM;
> > + //atomic_set(&op->n_pages, 1);
>
> ???

I should remove that - it kind of got left behind. That was necessary for the
old API, but a whole load of this code, including the fscache_retrieval struct
will be going away when the cookie and operation handling get rewritten.

> > +{
> > + if (fscache_cookie_valid(cookie) && fscache_cookie_enabled(cookie))
> > + return __fscache_begin_read_operation(rreq, cookie);
> > + else
> > + return -ENOBUFS;
> > +}
>
> No need for an else after a return. I personally also prefer to always
> handle the error case first, ala:

It's not precisely an error case, more a "fallback required" case.

> if (!fscache_cookie_valid(cookie) || !fscache_cookie_enabled(cookie))
> return -ENOBUFS;
> return __fscache_begin_read_operation(rreq, cookie);
>
> Also do we really need this inline fast path to start with?

Yes. fscache might be compiled out, in which case we'll never go down the
slow path. And the common case is that cookie == NULL, so let's not jump out
of line if we don't have to.

David