Re: [PATCH v2 2/4] PCI: hv: Add the support of hibernation

From: Lorenzo Pieralisi
Date: Mon Nov 25 2019 - 05:45:08 EST


On Sun, Nov 24, 2019 at 10:19:46PM +0000, Michael Kelley wrote:
> From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> Sent: Thursday, November 21, 2019 3:44 AM
> >
> > On Thu, Nov 21, 2019 at 12:50:17AM +0000, Dexuan Cui wrote:
> > > > From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> > > > Sent: Wednesday, November 20, 2019 9:20 AM
> > > >
> > > > On Tue, Nov 19, 2019 at 11:16:56PM -0800, Dexuan Cui wrote:
> > > > > Implement the suspend/resume callbacks.
> > > > >
> > > > > We must make sure there are no pending work items before we call
> > > > > vmbus_close().
> > > >
> > > > Where ? Why ? Imagine a developer reading this log to try to understand
> > > > why you made this change, do you really think this commit log is
> > > > informative in its current form ?
> > > >
> > > > I am not asking for a book, but this is a significant feature, so
> > > > please make an effort to explain it (I can update the log for you
> > > > but please write one and I shall do it).
> > > >
> > > > Lorenzo
> > >
> > > Sorry for being sloppy on this patch's changelog! Can you please use the
> > > below? I can also post v3 with the new changelog if that's better.
> >
> > As you wish, but more importantly get the Hyper-V maintainers to ACK
> > these changes since time is running out for v5.5.
> >
> > Lorenzo
> >
> > > PCI: hv: Add the support of hibernation
> > >
> > > hv_pci_suspend() runs in a process context as a callback in dpm_suspend().
> > > When it starts to run, the channel callback hv_pci_onchannelcallback(),
> > > which runs in a tasklet context, can still be running concurrently and
> > > scheduling new work items onto hbus->wq in hv_pci_devices_present() and
> > > hv_pci_eject_device(). The work item handlers can access the vmbus
> > > channel while hv_pci_suspend() is closing it, e.g. the work item
> > > handler pci_devices_present_work() -> new_pcichild_device() writes to
> > > the vmbus channel.
> > >
> > > To eliminate the race, hv_pci_suspend() disables the channel callback
> > > tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the tasklet.
> > >
> > > This way, when hv_pci_suspend() proceeds, it knows that no new work item
> > > can be scheduled, and then it flushes hbus->wq and safely closes the vmbus
> > > channel.
> > >
> > > Thanks,
> > > -- Dexuan
>
> FWIW, I'd like to see the above level of detail also as comments in the code
> itself so that whoever next looks at the code sees the explanation directly
> without having to review the commit logs.
>
> Also, the commit message doesn't say what the commit actually does and
> why. I'd suggest the commit message along these lines:
>
> Add suspend() and resume() functions so that Hyper-V virtual PCI devices are
> handled properly when the VM hibernates and resumes from hibernation.
>
> Note that the suspend() function must make sure there are no pending work
> items before calling vmbus_close(), since it runs in a process context as a
> callback in dpm_suspend(). When it starts to run, the channel callback
> hv_pci_onchannelcallback(), which runs in a tasklet context, can still be running
> concurrently and scheduling new work items onto hbus->wq in
> hv_pci_devices_present() and hv_pci_eject_device(). The work item
> handlers can access the vmbus channel while hv_pci_suspend() is closing
> it, e.g. the work item handler pci_devices_present_work() ->
> new_pcichild_device() writes to the vmbus channel.
>
> To eliminate the race, hv_pci_suspend() disables the channel callback
> tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the tasklet.
> This way, when hv_pci_suspend() proceeds, it knows that no new work item
> can be scheduled, and then it flushes hbus->wq and safely closes the vmbus
> channel.

This is much better, thank you. If you are happy with the patches,
please add your tags so that I can pull the series ASAP; hopefully
we can merge it in v5.5.

Thanks,
Lorenzo
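
For readers following the ordering argument in the proposed commit message
above, here is a rough userspace model of it. This is only a sketch, not the
driver code: every model_* name is hypothetical, the "tasklet" is a plain
flag, and everything runs single-threaded. It also assumes that the channel
callback checks the bus state before queuing work, which is what the log
implies but does not spell out. The point it illustrates is that once the
state is "removing", no new work can be queued, so flushing the queue before
closing the channel leaves no handler that can touch a closed channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these are the real kernel structures. */
enum model_bus_state { MODEL_BUS_ACTIVE, MODEL_BUS_REMOVING };

#define MODEL_WQ_CAPACITY 8

struct model_bus {
	enum model_bus_state state;
	bool callback_enabled;     /* stands in for the tasklet enable state */
	bool channel_open;         /* stands in for the vmbus channel */
	size_t pending;            /* work items queued on the model of hbus->wq */
	size_t writes_after_close; /* handlers that ran against a closed channel */
};

void model_bus_init(struct model_bus *bus)
{
	bus->state = MODEL_BUS_ACTIVE;
	bus->callback_enabled = true;
	bus->channel_open = true;
	bus->pending = 0;
	bus->writes_after_close = 0;
}

/* Models the channel callback; returns true if a work item was queued. */
bool model_channel_callback(struct model_bus *bus)
{
	if (!bus->callback_enabled)
		return false;
	/* Assumed check: refuse to queue work once the bus is being removed. */
	if (bus->state == MODEL_BUS_REMOVING)
		return false;
	if (bus->pending == MODEL_WQ_CAPACITY)
		return false;
	bus->pending++;
	return true;
}

/* Models flushing the workqueue: every pending handler writes to the channel. */
void model_flush_wq(struct model_bus *bus)
{
	while (bus->pending > 0) {
		if (!bus->channel_open)
			bus->writes_after_close++;
		bus->pending--;
	}
}

/* Models the suspend ordering described in the commit message. */
void model_suspend(struct model_bus *bus)
{
	bus->callback_enabled = false;   /* disable the channel callback */
	bus->state = MODEL_BUS_REMOVING; /* no new work may be queued from now on */
	bus->callback_enabled = true;    /* re-enable the channel callback */

	model_flush_wq(bus);             /* drain work queued before the state change */
	assert(bus->pending == 0);
	bus->channel_open = false;       /* only now close the channel */
}
```

In this model, calling model_channel_callback() a few times and then
model_suspend() leaves pending and writes_after_close at zero, and any later
model_channel_callback() call returns false. Swapping the flush and the close
inside model_suspend() would make writes_after_close non-zero, which is the
race the log describes.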