Re: [PATCH v6 12/13] xen/pvcalls: implement release command

From: Boris Ostrovsky
Date: Thu Oct 26 2017 - 10:10:07 EST


On 10/25/2017 07:00 PM, Stefano Stabellini wrote:
> On Wed, 25 Oct 2017, Boris Ostrovsky wrote:
>> On 10/24/2017 01:33 PM, Stefano Stabellini wrote:
>>> Send PVCALLS_RELEASE to the backend and wait for a reply. Take both
>>> in_mutex and out_mutex to avoid concurrent accesses. Then, free the
>>> socket.
>>>
>>> For passive sockets, check whether we have already pre-allocated an
>>> active socket for the purpose of being accepted. If so, free that as
>>> well.
>>>
>>> Signed-off-by: Stefano Stabellini <stefano@xxxxxxxxxxx>
>>> CC: boris.ostrovsky@xxxxxxxxxx
>>> CC: jgross@xxxxxxxx
>>> ---
>>> drivers/xen/pvcalls-front.c | 100 ++++++++++++++++++++++++++++++++++++++++++++
>>> drivers/xen/pvcalls-front.h | 1 +
>>> 2 files changed, 101 insertions(+)
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index 4a413ff..7abc039 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -199,6 +199,23 @@ static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
>>> static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
>>> struct sock_mapping *map, bool locked)
>>> {
>>> + int i;
>>> +
>>> + unbind_from_irqhandler(map->active.irq, map);
>>> +
>>> + if (!locked)
>>> + spin_lock(&bedata->socket_lock);
>>> + if (!list_empty(&map->list))
>>> + list_del_init(&map->list);
>>> + if (!locked)
>>> + spin_unlock(&bedata->socket_lock);
>>> +
>>> + for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)
>>> + gnttab_end_foreign_access(map->active.ring->ref[i], 0, 0);
>>> + gnttab_end_foreign_access(map->active.ref, 0, 0);
>>> + free_page((unsigned long)map->active.ring);
>>> +
>>> + kfree(map);
>>> }
>>>
>>> static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map)
>>> @@ -966,6 +983,89 @@ unsigned int pvcalls_front_poll(struct file *file, struct socket *sock,
>>> return ret;
>>> }
>>>
>>> +int pvcalls_front_release(struct socket *sock)
>>> +{
>>> + struct pvcalls_bedata *bedata;
>>> + struct sock_mapping *map;
>>> + int req_id, notify, ret;
>>> + struct xen_pvcalls_request *req;
>>> +
>> ..
>>
>>> +
>>> + if (map->active_socket) {
>>> + /*
>>> + * Set in_error and wake up inflight_conn_req to force
>>> + * recvmsg waiters to exit.
>>> + */
>>> + map->active.ring->in_error = -EBADF;
>>> + wake_up_interruptible(&map->active.inflight_conn_req);
>>> +
>>> + /*
>>> + * Wait until there are no more waiters on the mutexes.
>>> + * We know that no new waiters can be added because sk_send_head
>>> + * is set to NULL -- we only need to wait for the existing
>>> + * waiters to return.
>>> + */
>>> + while (!mutex_trylock(&map->active.in_mutex) ||
>>> + !mutex_trylock(&map->active.out_mutex))
>>> + cpu_relax();
>>> +
>>> + pvcalls_front_free_map(bedata, map, false);
>>> + } else {
>>> + spin_lock(&bedata->socket_lock);
>>> + if (READ_ONCE(map->passive.inflight_req_id) !=
>>> + PVCALLS_INVALID_ID) {
>>> + pvcalls_front_free_map(bedata,
>>> + map->passive.accept_map, true);
>>> + }
>>> + list_del(&map->list);
>>> + kfree(map);
>>> + spin_unlock(&bedata->socket_lock);
>> We have different locking rules in pvcalls_front_free_map() for each of
>> those clauses in that in the first case we are doing grant table
>> operations and free_page() without the lock and in the second case we
>> are holding it. Is it possible to restructure this so that we prune the
>> lists under the lock (possibly in this routine) and call
>> pvcalls_front_free_map() lock-less?
> Yes, it is possible. However, pvcalls_front_free_map is called from a
> couple of other places (pvcalls_front_accept and pvcalls_front_remove)
> and we would have to add the code to remove the map from the list there
> as well. I am not sure it is worth it.
>
> I don't have a strong opinion on this. Let me know which way you prefer.


I didn't realize this is called from multiple places.

How about

spin_lock(&bedata->socket_lock);
if (READ_ONCE(map->passive.inflight_req_id) !=
PVCALLS_INVALID_ID) {
if (!list_empty(&map->passive.accept_map->list))
list_del(&map->passive.accept_map->list);
}
list_del(&map->list);
spin_unlock(&bedata->socket_lock);

pvcalls_front_free_map(bedata, map->passive.accept_map, true);
kfree(map);


(I may have messed up list pointers here)

This would be slightly inefficient in that you drop the lock and then grab it again (only to find that the list is empty, presumably) but it makes pvcalls_front_free_map()'s behavior more consistent wrt locking. (Or maybe you will need to pass a bool to pvcalls_front_free_map() to indicate whether the map needs to be removed from the list for other call sites, in which case it will replace the 'locked' argument)


-boris