Re: [V9fs-developer] [PATCH 4/5] 9p: introduce async read requests
From: Stefano Stabellini
Date: Fri Dec 09 2016 - 17:23:04 EST
On Fri, 9 Dec 2016, Dominique Martinet wrote:
> Stefano Stabellini wrote on Thu, Dec 08, 2016:
> > If the read is an async operation, send a 9p request and return
> > EIOCBQUEUED. Do not wait for completion.
> >
> > Complete the read operation from a callback instead.
> >
> > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > ---
> > net/9p/client.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 86 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/9p/client.c b/net/9p/client.c
> > index eb589ef..f9f09db 100644
> > --- a/net/9p/client.c
> > +++ b/net/9p/client.c
> > @@ -28,6 +28,7 @@
> > #include <linux/module.h>
> > #include <linux/errno.h>
> > #include <linux/fs.h>
> > +#include <linux/pagemap.h>
> > #include <linux/poll.h>
> > #include <linux/idr.h>
> > #include <linux/mutex.h>
> > @@ -1554,13 +1555,68 @@ int p9_client_unlinkat(struct p9_fid *dfid, const char *name, int flags)
> > }
> > EXPORT_SYMBOL(p9_client_unlinkat);
> >
> > +static void
> > +p9_client_read_complete(struct p9_client *clnt, struct p9_req_t *req, int status)
> > +{
> > + int err, count, n, i, total = 0;
> > + char *dataptr, *to;
> > +
> > + if (req->status == REQ_STATUS_ERROR) {
> > + p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
> > + err = req->t_err;
> > + goto out;
> > + }
> > + err = p9_check_errors(clnt, req);
> > + if (err)
> > + goto out;
> > +
> > + err = p9pdu_readf(req->rc, clnt->proto_version,
> > + "D", &count, &dataptr);
> > + if (err) {
> > + trace_9p_protocol_dump(clnt, req->rc);
> > + goto out;
> > + }
> > + if (!count) {
> > + p9_debug(P9_DEBUG_ERROR, "count=%d\n", count);
> > + err = 0;
> > + goto out;
> > + }
> > +
> > + p9_debug(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
> > + if (count > req->rsize)
> > + count = req->rsize;
> > +
> > + for (i = 0; i < ((req->rsize + PAGE_SIZE - 1) / PAGE_SIZE); i++) {
> > + to = kmap(req->pagevec[i]);
> > + to += req->offset;
> > + n = PAGE_SIZE - req->offset;
> > + if (n > count)
> > + n = count;
> > + memcpy(to, dataptr, n);
> > + kunmap(req->pagevec[i]);
> > + req->offset = 0;
> > + count -= n;
> > + total += n;
> > + }
> > +
> > + err = total;
> > + req->kiocb->ki_pos += total;
> > +
> > +out:
> > + req->kiocb->ki_complete(req->kiocb, err, 0);
> > +
> > + release_pages(req->pagevec, (req->rsize + PAGE_SIZE - 1) / PAGE_SIZE, false);
> > + kvfree(req->pagevec);
> > + p9_free_req(clnt, req);
> > +}
> > +
> > int
> > p9_client_read(struct p9_fid *fid, struct kiocb *iocb, u64 offset,
> > struct iov_iter *to, int *err)
> > {
> > struct p9_client *clnt = fid->clnt;
> > struct p9_req_t *req;
> > - int total = 0;
> > + int total = 0, i;
> > *err = 0;
> >
> > p9_debug(P9_DEBUG_9P, ">>> TREAD fid %d offset %llu %d\n",
> > @@ -1587,10 +1643,38 @@ int p9_client_unlinkat(struct p9_fid *dfid, const char *name, int flags)
> > req = p9_client_zc_rpc(clnt, P9_TREAD, to, NULL, rsize,
> > 0, 11, "dqd", fid->fid,
> > offset, rsize);
> > - } else {
> > + /* sync request */
> > + } else if(iocb == NULL || is_sync_kiocb(iocb)) {
> > non_zc = 1;
> > req = p9_client_rpc(clnt, P9_TREAD, "dqd", fid->fid, offset,
> > rsize);
> > + /* async request */
> > + } else {
>
> I'm not too familiar with iocb/how async IOs should work, but a logic
> question just to make sure that has been thought out:
> We prefer zc here to async, even if zc can be slow?
>
> Ideally at some point zc and async aren't exclusive so we'll have async
> zc and async normal, but for now I'd say async comes before zc - yes
> there will be an extra copy in memory, but it will be done
> asynchronously.
> Was it intentional to prefer zc here?
I wasn't sure what to do about zc. The backends I am testing with don't
support zc, so I didn't feel confident in changing its behavior. I think
whether zc is faster than async+copy depends on the specific benchmark.
iodepth and blocksize parameters in fio, for example. With iodepth=1, zc
would be faster, the higher the iodepth, the faster async+copy would
become in comparison. At some point async+copy will be faster than zc,
but I am not sure where is the threshold, it would probably be storage
backend dependent too. Maybe around iodepth=3. This is a reasonable
guess but I haven't run any numbers to confirm it.
That said, I am happy to follow any strategy you suggest in regards to zc.
> > + req = p9_client_get_req(clnt, P9_TREAD, "dqd", fid->fid, offset, rsize);
> > + if (IS_ERR(req)) {
> > + *err = PTR_ERR(req);
> > + break;
> > + }
> > + req->rsize = iov_iter_get_pages_alloc(to, &req->pagevec,
> > + (size_t)rsize, &req->offset);
> > + req->kiocb = iocb;
> > + for (i = 0; i < req->rsize; i += PAGE_SIZE)
> > + page_cache_get_speculative(req->pagevec[i/PAGE_SIZE]);
> > + req->callback = p9_client_read_complete;
> > +
> > + *err = clnt->trans_mod->request(clnt, req);
> > + if (*err < 0) {
> > + clnt->status = Disconnected;
> > + release_pages(req->pagevec,
> > + (req->rsize + PAGE_SIZE - 1) / PAGE_SIZE,
> > + true);
> > + kvfree(req->pagevec);
> > + p9_free_req(clnt, req);
> > + break;
> > + }
> > +
> > + *err = -EIOCBQUEUED;
> > + break;
>
> (Just a note to myself (or anyone who wants to do this) that a couple of
> places only look at err if size written is 0... They should check if err
> has been set)
>
> > }
> > if (IS_ERR(req)) {
> > *err = PTR_ERR(req);
>