RE: [PATCH v2] hv_sock: Add support for delayed close

From: Dexuan Cui
Date: Thu May 16 2019 - 00:36:24 EST


> From: Sunil Muthuswamy <sunilmut@xxxxxxxxxxxxx>
> Sent: Tuesday, May 14, 2019 5:56 PM
> ...
> +static bool hvs_close_lock_held(struct vsock_sock *vsk)
> {
> ...
> + schedule_delayed_work(&vsk->close_work, HVS_CLOSE_TIMEOUT);

Reviewed-by: Dexuan Cui <decui@xxxxxxxxxxxxx>

The patch looks good to me. Thank you, Sunil!

Next, we're going to remove the "channel->rescind" check in
vmbus_hvsock_device_unregister() -- when doing that, IMO we need to
fix a potential race revealed by the schedule_delayed_work() in this
patch:

When hvs_close_timeout() finishes, the "sk" struct has been freed, but
vmbus_onoffer_rescind() -> channel->chn_rescind_callback(), i.e.
hvs_close_connection(), may be still running and referencing the "chan"
and "sk" structs (), which should no longer be referenced when
hvs_close_timeout() finishes, i.e. "get_per_channel_state(chan)" is no
longer safe. The problem is: currently there is no sync mechanism
between vmbus_onoffer_rescind() and hvs_close_timeout().

The race is a real issue only after we remove the "channel->rescind"
in vmbus_hvsock_device_unregister().

I guess we need to introduce a new single-threaded workqueue in the
vmbus driver, and offload both vmbus_onoffer_rescind() and
hvs_close_timeout() onto the new workqueue.

Thanks,
-- Dexuan