Re: [PATCH v3 08/13] xen/pvcalls: implement accept command
From: Stefano Stabellini
Date: Fri Sep 08 2017 - 18:16:09 EST
On Mon, 14 Aug 2017, Boris Ostrovsky wrote:
> On 07/31/2017 06:57 PM, Stefano Stabellini wrote:
> > Introduce a waitqueue to allow only one outstanding accept command at
> > any given time and to implement polling on the passive socket. Introduce
> > a flags field to keep track of in-flight accept and poll commands.
> >
> > Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make
> > sure that only one accept command is executed at any given time by
> > setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the
> > inflight_accept_req waitqueue.
> >
> > Convert the new struct sock_mapping pointer into an uint64_t and use it
> > as id for the new socket to pass to the backend.
> >
> > Check if the accept call is non-blocking: in that case after sending the
> > ACCEPT command to the backend store the sock_mapping pointer of the new
> > struct and the inflight req_id then return -EAGAIN (which will respond
> > only when there is something to accept). Next time accept is called,
> > we'll check if the ACCEPT command has been answered, if so we'll pick up
> > where we left off, otherwise we return -EAGAIN again.
> >
> > Note that, differently from the other commands, we can use
> > wait_event_interruptible (instead of wait_event) in the case of accept
> > as we are able to track the req_id of the ACCEPT response that we are
> > waiting.
> >
> > Signed-off-by: Stefano Stabellini <stefano@xxxxxxxxxxx>
> > CC: boris.ostrovsky@xxxxxxxxxx
> > CC: jgross@xxxxxxxx
> > ---
> > drivers/xen/pvcalls-front.c | 111 ++++++++++++++++++++++++++++++++++++++++++++
> > drivers/xen/pvcalls-front.h | 3 ++
> > 2 files changed, 114 insertions(+)
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index b2757f5..f83b910 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -65,6 +65,16 @@ struct sock_mapping {
> > #define PVCALLS_STATUS_BIND 1
> > #define PVCALLS_STATUS_LISTEN 2
> > uint8_t status;
> > + /*
> > + * Internal state-machine flags.
> > + * Only one accept operation can be inflight for a socket.
> > + * Only one poll operation can be inflight for a given socket.
> > + */
> > +#define PVCALLS_FLAG_ACCEPT_INFLIGHT 0
> > + uint8_t flags;
> > + uint32_t inflight_req_id;
> > + struct sock_mapping *accept_map;
> > + wait_queue_head_t inflight_accept_req;
> > } passive;
> > };
> > };
> > @@ -414,6 +424,107 @@ int pvcalls_front_listen(struct socket *sock, int backlog)
> > return ret;
> > }
> > +int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int flags)
> > +{
> > + struct pvcalls_bedata *bedata;
> > + struct sock_mapping *map;
> > + struct sock_mapping *map2 = NULL;
> > + struct xen_pvcalls_request *req;
> > + int notify, req_id, ret, evtchn, nonblock;
> > +
> > + if (!pvcalls_front_dev)
> > + return -ENOTCONN;
> > + bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
> > +
> > + map = (struct sock_mapping *) READ_ONCE(sock->sk->sk_send_head);
> > + if (!map)
> > + return -ENOTSOCK;
> > +
> > + if (map->passive.status != PVCALLS_STATUS_LISTEN)
> > + return -EINVAL;
> > +
> > + nonblock = flags & SOCK_NONBLOCK;
> > + /*
> > + * Backend only supports 1 inflight accept request, will return
> > + * errors for the others
> > + */
> > + if (test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
> > + (void *)&map->passive.flags)) {
> > + req_id = READ_ONCE(map->passive.inflight_req_id);
> > + if (req_id != PVCALLS_INVALID_ID &&
> > + READ_ONCE(bedata->rsp[req_id].req_id) == req_id)
> > + goto received;
> > + if (nonblock)
> > + return -EAGAIN;
> > + if (wait_event_interruptible(map->passive.inflight_accept_req,
> > + !test_and_set_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
> > + (void *)&map->passive.flags)))
> > + return -EINTR;
> > + }
> > +
> > + spin_lock(&bedata->pvcallss_lock);
> > + ret = get_request(bedata, &req_id);
> > + if (ret < 0) {
> > + spin_unlock(&bedata->pvcallss_lock);
> > + return ret;
> > + }
> > + map2 = kzalloc(sizeof(*map2), GFP_KERNEL);
> > + if (map2 == NULL)
> > + return -ENOMEM;
> > + ret = create_active(map2, &evtchn);
> > + if (ret < 0) {
> > + kfree(map2);
>
>
> In the connect patch create_active() frees maps2 (and I had some comments
> about it)
Yes, you were right: there is no need to free map in create_active. map
will be freed by the next "close" call on the passive socket. However,
we do need to free map2 here, because it was allocated in this call a
few lines above.
> > + return -ENOMEM;
>
> Both error paths need an unlock.
You are right!
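
To make the intended shape concrete, here is a minimal user-space sketch
of the error handling, not the actual driver code. It assumes a toy
spinlock built on a C11 atomic_flag in place of spin_lock, calloc/free
in place of kzalloc/kfree, and hypothetical names (struct mapping,
fake_create_active, accept_setup, lock_is_free) that exist only for
illustration. The point is that every exit drops the lock exactly once
and that the caller, not the create_active stand-in, frees map2.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sock_mapping; not the real layout. */
struct mapping { int ref; };

/* Toy spinlock standing in for the driver's lock. */
static atomic_flag ring_lock = ATOMIC_FLAG_INIT;

static void ring_lock_acquire(void)
{
	while (atomic_flag_test_and_set(&ring_lock))
		;	/* spin until the flag was previously clear */
}

static void ring_lock_release(void)
{
	atomic_flag_clear(&ring_lock);
}

/* Stand-in for create_active(): may fail, never frees its argument. */
static int fake_create_active(struct mapping *m, int should_fail)
{
	if (should_fail)
		return -ENOMEM;
	m->ref = 1;
	return 0;
}

/*
 * Returns 0 and stores the new mapping in *out on success, or a
 * negative errno. Every exit releases ring_lock exactly once.
 */
static int accept_setup(struct mapping **out, int fail_alloc, int fail_create)
{
	struct mapping *map2;
	int ret;

	assert(out != NULL);
	ring_lock_acquire();

	map2 = fail_alloc ? NULL : calloc(1, sizeof(*map2));
	if (!map2) {
		ret = -ENOMEM;
		goto out_unlock;
	}

	ret = fake_create_active(map2, fail_create);
	if (ret < 0)
		goto out_free;

	*out = map2;
	ring_lock_release();
	return 0;

out_free:
	free(map2);	/* allocated in this call, so this call frees it */
out_unlock:
	ring_lock_release();
	return ret;
}

/* Helper for checks: returns 1 if ring_lock is currently free. */
static int lock_is_free(void)
{
	if (atomic_flag_test_and_set(&ring_lock))
		return 0;
	atomic_flag_clear(&ring_lock);
	return 1;
}
```

The goto labels unwind in reverse order of acquisition, so adding
another resource later only needs one more label rather than another
copy of the unlock.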
> > + }
> > + list_add_tail(&map2->list, &bedata->socket_mappings);
> > +
> > + req = RING_GET_REQUEST(&bedata->ring, req_id);
> > + req->req_id = req_id;
> > + req->cmd = PVCALLS_ACCEPT;
> > + req->u.accept.id = (uint64_t) map;
> > + req->u.accept.ref = map2->active.ref;
> > + req->u.accept.id_new = (uint64_t) map2;
> > + req->u.accept.evtchn = evtchn;
> > + map->passive.accept_map = map2;
> > +
> > + bedata->ring.req_prod_pvt++;
> > + RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&bedata->ring, notify);
> > + spin_unlock(&bedata->pvcallss_lock);
> > + if (notify)
> > + notify_remote_via_irq(bedata->irq);
> > + if (nonblock) {
> > + WRITE_ONCE(map->passive.inflight_req_id, req_id);
> > + return -EAGAIN;
>
> Would it be worth checking (maybe a few times) whether the response has come
> back?
Yes, it is something to keep in mind. I'll add an in-code comment as a
reminder. I don't think this path is worth optimizing now, because it
is uncommon to set the passive socket as non-blocking. Many
implementations leave the passive socket blocking and use "poll" to
check whether there is something to do before calling "accept".
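
For reference, the common user-space pattern looks roughly like the
sketch below. It uses only standard POSIX socket calls on an ordinary
loopback TCP socket, nothing pvcalls-specific; make_listener and
poll_then_accept are hypothetical helper names chosen for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a blocking TCP listener on 127.0.0.1 with a kernel-chosen port. */
static int make_listener(struct sockaddr_in *addr)
{
	socklen_t len = sizeof(*addr);
	int fd;

	assert(addr != NULL);
	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(addr, 0, sizeof(*addr));
	addr->sin_family = AF_INET;
	addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr->sin_port = 0;	/* let the kernel pick a free port */

	if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
	    listen(fd, 1) < 0 ||
	    getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Wait up to timeout_ms for a pending connection, then accept it.
 * The listening socket stays blocking; poll() is what avoids the stall.
 * Returns the connected fd, or -1 on timeout or error.
 */
static int poll_then_accept(int listen_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n <= 0 || !(pfd.revents & POLLIN))
		return -1;
	return accept(listen_fd, NULL, NULL);
}
```

With this pattern accept is only called once poll has reported the
listener readable, so the non-blocking accept path is rarely exercised.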
> > + }
> > +
> > + if (wait_event_interruptible(bedata->inflight_req,
> > + READ_ONCE(bedata->rsp[req_id].req_id) == req_id))
> > + return -EINTR;
> > +
> > +received:
> > + map2 = map->passive.accept_map;
>
> I think this could go to the inflight check in the beginning of this routine
> (before the 'goto'). Otherwise map2 has already been assigned above.
Yes, I'll do that. The code worked anyway because of the
map->passive.accept_map = map2;
assignment in the other path.
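
Roughly, the non-blocking flow with map2 picked up before the jump could
be modelled as in the sketch below. This is a simplified user-space
model under several assumptions, not the patch itself: an atomic_bool
stands in for PVCALLS_FLAG_ACCEPT_INFLIGHT, a small array stands in for
bedata->rsp, the request always uses slot 0, and struct new_map,
struct passive, backend_reply and nonblocking_accept are hypothetical
names.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_ID (-1)
#define NR_SLOTS   2

struct new_map { int id; };	/* stand-in for the new active mapping */

struct passive {
	atomic_bool accept_inflight;	/* models the ACCEPT_INFLIGHT bit */
	int inflight_req_id;
	struct new_map *accept_map;
};

/* Models the response slots: slot i is answered when it holds i. */
static int rsp_req_id[NR_SLOTS] = { INVALID_ID, INVALID_ID };

/* Pretend the backend answered the request in the given slot. */
static void backend_reply(int req_id)
{
	rsp_req_id[req_id] = req_id;
}

/*
 * Non-blocking accept model. The first call records the in-flight
 * request and returns -EAGAIN; later calls return -EAGAIN until the
 * backend has replied, then hand back the stored mapping.
 */
static int nonblocking_accept(struct passive *p, struct new_map *fresh,
			      struct new_map **out)
{
	struct new_map *map2;
	int req_id;

	assert(p != NULL && out != NULL);

	if (atomic_exchange(&p->accept_inflight, true)) {
		req_id = p->inflight_req_id;
		if (req_id == INVALID_ID || rsp_req_id[req_id] != req_id)
			return -EAGAIN;	/* still waiting for the backend */
		map2 = p->accept_map;	/* assigned here, before the jump */
		goto received;
	}

	/* First call: "send" the request and remember what is in flight. */
	req_id = 0;
	p->accept_map = fresh;
	p->inflight_req_id = req_id;
	return -EAGAIN;

received:
	rsp_req_id[req_id] = INVALID_ID;
	p->inflight_req_id = INVALID_ID;
	atomic_store(&p->accept_inflight, false);
	*out = map2;
	return 0;
}
```

In this model the only way to reach the received label is through the
in-flight check, so map2 is always set from accept_map on that path and
nothing needs to be reassigned after the label.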
> > + map2->sock = newsock;
> > + newsock->sk = kzalloc(sizeof(*newsock->sk), GFP_KERNEL);
> > + if (!newsock->sk) {
> > + WRITE_ONCE(bedata->rsp[req_id].req_id, PVCALLS_INVALID_ID);
> > + WRITE_ONCE(map->passive.inflight_req_id, PVCALLS_INVALID_ID);
> > + pvcalls_front_free_map(bedata, map2);
> > + kfree(map2);
> > + return -ENOMEM;
> > + }
> > + WRITE_ONCE(newsock->sk->sk_send_head, (void *)map2);
> > +
> > + clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT, (void *)&map->passive.flags);
> > + wake_up(&map->passive.inflight_accept_req);
> > +
> > + ret = bedata->rsp[req_id].ret;