RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the device

From: Dexuan Cui
Date: Sat Apr 24 2021 - 22:24:38 EST


> From: Long Li <longli@xxxxxxxxxxxxx>
> Sent: Friday, April 23, 2021 11:49 AM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>; longli@xxxxxxxxxxxxxxxxx; KY
>
> > Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> > device
> >
> > > From: Long Li <longli@xxxxxxxxxxxxx>
> > > Sent: Friday, April 23, 2021 11:32 AM
> > > > ...
> > > > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > > > hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():
> > >
> > > In most case, it will not print anything.
> >
> > If I read the code correctly, I think the warning is printed _every time_ we
> > unload pci-hyperv.
>
> Okay I see what you mean. I'll remove this message.

Here we just want to avoid printing the message every time the pci-hyperv
driver is unloaded. We might still want to see the message when the PCI
device is removed, but it's OK with me if the message is removed
unconditionally.

The real issue with the patch is that the 'hpdev' struct is never freed when
the driver is unloaded: if we print out the value of the ref counter in
put_pcichild(), we will notice that the ref counter is still two when the
driver is unloaded, i.e. a memory leak occurs.

Before the patch, hv_pci_remove() calls hv_pci_bus_exit() ->
hv_pci_start_relations_work(), and the ref counter drops to zero in
pci_devices_present_work() due to its two calls to put_pcichild().

With the patch, when the driver is unloaded, pci_devices_present_work()
is not scheduled, hence the ref counter doesn't drop to zero.
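
To illustrate the accounting described above, here is a minimal userspace
sketch. It is not the driver code: struct hpdev_model and the model_*()
helpers are hypothetical stand-ins, and the starting count of two is an
assumption taken from the value observed in put_pcichild() at unload time.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the driver's 'hpdev' struct. */
struct hpdev_model {
	int refs;	/* models the ref counter seen in put_pcichild() */
	bool *freed;	/* set to true when the last reference is dropped */
};

/* Create a device that already holds two references (assumption). */
static struct hpdev_model *model_new(bool *freed)
{
	struct hpdev_model *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->refs = 2;
	h->freed = freed;
	return h;
}

/* Models put_pcichild(): free the struct when the count reaches zero. */
static void model_put(struct hpdev_model *h)
{
	if (--h->refs == 0) {
		*h->freed = true;
		free(h);
	}
}

/* Models the two puts done by pci_devices_present_work() on removal. */
static void model_present_work(struct hpdev_model *h)
{
	model_put(h);
	model_put(h);
}

/*
 * Models a driver unload. Returns the number of references left over:
 * 0 means the struct was freed, anything else means it was leaked.
 * Returns -1 if the allocation failed.
 */
static int model_unload(bool schedule_work)
{
	bool freed = false;
	struct hpdev_model *h = model_new(&freed);
	int left;

	if (!h)
		return -1;

	if (schedule_work)
		model_present_work(h);

	if (freed)
		return 0;

	left = h->refs;
	free(h);	/* cleanup for the demo only; the real driver leaks here */
	return left;
}
```

In this model, model_unload(true) corresponds to the behavior before the
patch (the work runs and the count reaches zero), and model_unload(false)
corresponds to the behavior with the patch (the work is never scheduled,
so both references remain).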