RE: [PATCH v3 11/12] Drivers: hv: vmbus: Suspend after cleaning up hv_sock and sub channels

From: Dexuan Cui
Date: Fri Aug 30 2019 - 22:55:05 EST


> From: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Sent: Friday, August 23, 2019 1:17 PM
>
> From: Dexuan Cui Sent: Monday, August 19, 2019 6:52 PM
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > @@ -499,6 +499,8 @@ static void vmbus_add_channel_work(struct
> work_struct *work)
> > return;
> >
> > err_deq_chan:
> > + WARN_ON_ONCE(1);
> > +
>
> Why this change? I was thinking maybe it's a debug statement that got
> left in.

This is indeed a debug statement. :-)
I was not so sure if the error handling is absolutely correct there.
I think I can remove this WARN_ON_ONCE() since almost all of the possible
error paths have a pr_err(). We can make an extra patch to fix any bug in
the existing error handling code. BTW, it should be pretty unlikely to reach
the "err_deq_chan label" in practice.

> > @@ -1021,6 +1044,11 @@ static void vmbus_onoffer_rescind(struct
> > vmbus_channel_message_header *hdr)
> > }
> > mutex_unlock(&vmbus_connection.channel_mutex);
> > }
> > +
> > + /* The "channel" may have been freed. Do not access it any longer. */
> > +
> > + if (clean_up_chan_for_suspend)
> > + check_ready_for_suspend_event();
> > }
>
> Having to add the above lines twice is a bit clumsy, but the problem is
> the overall structure of the vmbus_onoffer_rescind. The early return in
> the case of a rescind_callback function is a bit weird, but I guess it makes
> sense since from what I can tell, only uio and hv_sock have rescind callback
> functions. Some minor restructuring might be warranted, but I don't feel
> strongly about it.

I agree. We should restructure vmbus_onoffer_rescind(), but I think we can
do it in a separate patch. Here in this patch I'm focused on the minimal
required change for hibernation.

> > + /*
> > + * Wait until all the sub-channels and hv_sock channels have been
> > + * cleaned up. Sub-channels should be destroyed upon suspend,
> otherwise
> > + * they would conflict with the new sub-channels that will be created
> > + * in the resume path. hv_sock channels should also be destroyed, but
> > + * a hv_sock channel of an established hv_sock connection can not be
> > + * really destroyed since it may still be referenced by the userspace
> > + * application, so we just force the hv_sock channel to be rescinded
> > + * by vmbus_force_channel_rescinded(), and the userspace application
> > + * will thoroughly destroy the channel after hibernation.
> > + */
> > + if (atomic_read(&vmbus_connection.nr_chan_close_on_suspend) > 0)
>
> At first glance, the above line seemed useless to me. You could just do the
> wait_for_completion() unconditionally. But is the intent to handle the case
> where the VM never had any sub-channels or hv_sock channels, and so
> nr_chan_close_on_suspend never went above 0?

Yes.

> If so, a comment might be helpful.
> > +wait_for_completion(&vmbus_connection.ready_for_suspend_event);

Ok. I'll add a commnt for this in v4.

Thanks,
-- Dexuan