Re: [PATCH v8 07/20] cachefiles: document on-demand read mode
From: David Howells
Date: Mon Apr 11 2022 - 09:39:13 EST
Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> + (*) On-demand Read.
> +
Unnecessary extra blank line.
Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> +
> +
> +On-demand Read
> +==============
> +
> +When working in original mode, cachefiles mainly serves as a local cache for
> +remote networking fs, while in on-demand read mode, cachefiles can boost the
> +scenario where on-demand read semantics is needed, e.g. container image
> +distribution.
> +
> +The essential difference between these two modes is that, in original mode,
> +when cache miss, netfs itself will fetch data from remote, and then write the
> +fetched data into cache file. While in on-demand read mode, a user daemon is
> +responsible for fetching data and then writing to the cache file.
> +
> +``CONFIG_CACHEFILES_ONDEMAND`` shall be enabled to support on-demand read mode.
You're missing a few articles there. How about:
"""
When working in its original mode, cachefiles mainly serves as a local cache
for a remote networking fs - while in on-demand read mode, cachefiles can boost
the scenario where on-demand read semantics are needed, e.g. container image
distribution.
The essential difference between these two modes is that, in original mode,
when a cache miss occurs, the netfs will fetch the data from the remote server
and then write it to the cache file. With on-demand read mode, however,
fetching and the data and writing it into the cache is delegated to a user
daemon.
``CONFIG_CACHEFILES_ONDEMAND`` shall be enabled to support on-demand read mode.
"""
"should be enabled".
Also, two spaces after a full stop please (but not after the dot in a
contraction, e.g. "e.g.").
> +The on-demand read mode relies on a simple protocol used for communication
> +between kernel and user daemon. The model is like::
"The protocol can be modelled as"?
> +The cachefiles kernel module will send requests to
the
> user daemon when needed.
> +
the
> User daemon needs to poll on the devnode ('/dev/cachefiles') to check if
> +there's
a
> pending request to be processed. A POLLIN event will be returned
> +when there's
a
> pending request.
> +Then user daemon needs to read
"The user daemon [than] reads "
> the devnode to fetch one
one -> a
> request and process it
> +accordingly. It is worth nothing
nothing -> noting
> that each read only gets one request. When
> +finished processing the request,
the
> user daemon needs to write the reply to the
> +devnode.
> +Each request is started with a message header like::
"is started with" -> "starts with".
"like" -> "of the form".
> + * ``id`` is a unique ID identifying this request among all pending
> + requests.
What's the scope of the uniqueness of "id"? Is it just unique to a particular
cachefiles cache?
> + * ``len`` identifies the whole length of this request, including the
> + header and following type specific payload.
type-specific.
> +An optional parameter is added to "bind" command::
to the "bind" command.
> +When
the
> "bind" command takes
takes -> is given
> without argument, it defaults to the original mode.
> +When
the
> "bind" command takes
is given
> with
the
> "ondemand" argument, i.e. "bind ondemand",
> +on-demand read mode will be enabled.
> +OPEN Request
The
> +------------
> +
> +When
the
> netfs opens a cache file for the first time, a request with
the
> +CACHEFILES_OP_OPEN opcode, a.k.a
an
> OPEN request will be sent to
the
> user daemon. The
> +payload format is like::
format is like -> of the form
> +
> + struct cachefiles_open {
> + __u32 volume_key_size;
> + __u32 cookie_key_size;
> + __u32 fd;
> + __u32 flags;
> + __u8 data[];
> + };
> +
"where:"
> + * ``data`` contains
the
> volume_key and cookie_key in sequence.
Might be better to say "contains the volume_key followed directly by the
cookie_key. The volume key is a NUL-terminated string; cookie_key is binary
data.".
> +
> + * ``volume_key_size`` identifies
identifies -> indicates/supplies
> the size of
the
> volume key of the cache
> + file, in bytes. volume_key is of string format, with a suffix '\0'.
> +
> + * ``cookie_key_size`` identifies the size of cookie key of the cache
> + file, in bytes. cookie_key is of binary format, which is netfs
> + specific.
"... indicates the size of the cookie key in bytes."
> +
> + * ``fd`` identifies the
the -> an
> anonymous fd of
of -> referring to
> the cache file, with
with -> through
> which user
> + daemon can perform write/llseek file operations on the cache file.
> +
> +
> +
The
> OPEN request contains
a
> (volume_key, cookie_key, anon_fd) triple for
triplet for the
I would probably also use {...} rather than (...).
> corresponding
> +cache file. With this triple,
triplet, the
> user daemon could
could -> can
> fetch and write data into the
> +cache file in the background, even when kernel has not triggered the
the -> a
> cache miss
> +yet.
The
> User daemon is able to distinguish the requested cache file with the given
> +(volume_key, cookie_key), and write the fetched data into
the
> cache file with
with -> using
> the
> +given anon_fd.
> +
> +After recording the (volume_key, cookie_key, anon_fd) triple,
triplet, the
> user daemon shall
shall -> should
> +reply with
reply with -> complete the request by issuing a
> "copen" (complete open) command::
> +
> + copen <id>,<cache_size>
> +
> + * ``id`` is exactly the id field of the previous OPEN request.
> +
> + * When >= 0, ``cache_size`` identifies the size of the cache file;
> + when < 0, ``cache_size`` identifies the error code ecountered by the
> + user daemon.
identifies -> indicates
ecountered -> encountered
> +CLOSE Request
The
> +-------------
> +When
a
> cookie withdrawed,
withdrawed -> withdrawn
> a request with
a
> CACHEFILES_OP_CLOSE opcode, a.k.a CLOSE
> +request,
Maybe phrase as "... a close request (opcode CACHEFILES_OP_CLOSE),
> will be sent to user daemon. It will notify
the
> user daemon to close the
> +attached anon_fd. The payload format is like::
like -> of the form
> +
> + struct cachefiles_close {
> + __u32 fd;
> + };
> +
"where:"
> + * ``fd`` identifies the anon_fd to be closed, which is exactly the same
"... which should be the same as that provided to the OPEN request".
Is it possible for userspace to move the fd around with dup() or whatever?
> + with that in OPEN request.
> +
> +
> +READ Request
The
> +------------
> +
> +When on-demand read mode is turned on, and
a
> cache miss encountered,
the
> kernel will
> +send a request with CACHEFILES_OP_READ opcode, a.k.a READ request,
"send a READ request (opcode CACHEFILES_OP_READ)"
> to
the
> user
> +daemon. It will notify
It will notify -> This will ask/tell
> user daemon to fetch data in the requested file range.
> +The payload format is like::
format is like -> is of the form
> +
> + struct cachefiles_read {
> + __u64 off;
> + __u64 len;
> + __u32 fd;
> + };
> +
> + * ``off`` identifies the starting offset of the requested file range.
identifies -> indicates
> +
> + * ``len`` identifies the length of the requested file range.
> +
identifies -> indicates (you could alternatively say "specified")
> + * ``fd`` identifies the anonymous fd of the requested cache file. It is
> + guaranteed that it shall be the same with
"same with" -> "same as"
Since the kernel cannot make such a guarantee, I think you may need to restate
this as something like "Userspace must present the same fd as was given in the
previous OPEN request".
> the fd field in the previous
> + OPEN request.
> +
> +When receiving one
one -> a
> READ request,
the
> user daemon needs to fetch
the
> data of the
> +requested file range, and then write the fetched data
, and then write the fetched data -> and write it
> into cache file
cache file -> cache
> with
using
> the
> +given anonymous fd.
+ to indicate the destination.
> +
> +When finished
When finished -> To finish
> processing the READ request,
the
> user daemon needs to reply with
the
> +CACHEFILES_IOC_CREAD ioctl on the corresponding anon_fd::
> +
> + ioctl(fd, CACHEFILES_IOC_CREAD, id);
> +
> + * ``fd`` is exactly the fd field of the previous READ request.
Does that have to be true? What if userspace moves it somewhere else?
> +
> + * ``id`` is exactly the id field of the previous READ request.
is exactly the -> must match the
David