RE: [PATCH 2/2] hv_balloon: Reorganize the probe function

From: Dexuan Cui
Date: Fri Jun 14 2019 - 19:13:38 EST


> From: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Sent: Friday, June 14, 2019 2:56 PM
> > ...
> > + ret = balloon_connect_vsp(dev);
> > + if (ret != 0)
> > + return ret;
> > +
> > dm_device.state = DM_INITIALIZED;
> > - last_post_time = jiffies;
>
> I was curious about the above deletion. But I guess the line
> is not needed as the time_after() check in post_status() should
> handle an initial value of 0 for last_post_time just fine.

In a 32-bit kernel, sizeof(unsigned long) is 4, and the global 32-bit
varilable "jiffies" can overflow in 49.7 days if HZ is defined as 1000;
so in theory there is a tiny chance time_after() can not work as
expected here (i.e. we're loading hv_balloon driver when the
"jiffies" is just about to overflow, which is highly unlikely in practice);
even if that happens, we do not care, since the consequence is
just that the memory pressure reporting is delayed by 1 second. :-)

> > +
> > + dm_device.thread =
> > + kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> > + if (IS_ERR(dm_device.thread)) {
> > + ret = PTR_ERR(dm_device.thread);
> > + goto probe_error;
> > + }
>
> Just an observation: this thread creation now happens at the end of the
> probing process. But that's good, because in the old code, the thread
> was started and could run before the protocol version had been
> negotiated. So I'll assume your change here is intentional.

Yes, this is intentional.

> >
> > return 0;
> >
> > -probe_error2:
> > +probe_error:
> > + vmbus_close(dev->channel);
> > #ifdef CONFIG_MEMORY_HOTPLUG
> > + unregister_memory_notifier(&hv_memory_nb);
>
> Hmmm. Evidently the above cleanup was missing in the
> old code.

Yes.

> > restore_online_page_callback(&hv_online_page);
> > #endif
> > - kthread_stop(dm_device.thread);
> > -
> > -probe_error1:
> > - vmbus_close(dev->channel);
> > return ret;
> > }
> >
> > @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device
> *dev)
> > cancel_work_sync(&dm->balloon_wrk.wrk);
> > cancel_work_sync(&dm->ha_wrk.wrk);
> >
> > - vmbus_close(dev->channel);
> > kthread_stop(dm->thread);
> > + vmbus_close(dev->channel);
>
> Presumably this is an intentional ordering change as well.
> The kthread should be stopped before closing the channel.

Yes. The old code is buggy: after the vmbus_close(), there is
a small window in which the old code can still try to send
messages to the host via a freed ringbuffer, causing panic.

> > #ifdef CONFIG_MEMORY_HOTPLUG
> > - restore_online_page_callback(&hv_online_page);
> > unregister_memory_notifier(&hv_memory_nb);
> > + restore_online_page_callback(&hv_online_page);
>
> And you've changed the ordering of these steps so they are
> the inverse of when they are set up. Also a good cleanup ....

Yes. The change is not really necessary, but let's just do it
in a better manner.

>
> Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>

Thaks for the detailed comments!

Thanks,
-- Dexuan