RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the device

From: Long Li
Date: Sun Apr 25 2021 - 00:53:29 EST


> Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> device
>
> > From: Long Li <longli@xxxxxxxxxxxxx>
> > Sent: Friday, April 23, 2021 11:49 AM
> > To: Dexuan Cui <decui@xxxxxxxxxxxxx>; longli@xxxxxxxxxxxxxxxxx; KY
> >
> > > Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when
> > > removing the device
> > >
> > > > From: Long Li <longli@xxxxxxxxxxxxx>
> > > > Sent: Friday, April 23, 2021 11:32 AM
> > > > > ...
> > > > > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > > > > hv_pci_remove() -> hv_pci_bus_exit() ->
> > > > > hv_pci_start_relations_work():
> > > >
> > > > In most cases, it will not print anything.
> > >
> > > If I read the code correctly, I think the warning is printed _every
> > > time_ we unload pci-hyperv.
> >
> > Okay I see what you mean. I'll remove this message.
>
> Here we just want to avoid the message every time the pci-hyperv driver is
> unloaded. We might want to see the possible message when the PCI device
> is removed, but it's OK with me if the message is unconditionally removed.
>
> The real issue with the patch is that the 'hpdev' struct is never freed when
> the driver is unloaded: if we print out the value of the ref counter in
> put_pcichild(), we would notice that the ref counter is still two when the
> driver is unloaded, i.e. a memory leak occurs.
>
> Before the patch, hv_pci_remove() calls hv_pci_bus_exit() ->
> hv_pci_start_relations_work(), and the ref counter drops to zero in
> pci_devices_present_work() due to the two calls to put_pcichild().
>
> With the patch, when the driver is unloaded, pci_devices_present_work() is
> not scheduled, hence the ref counter doesn't drop to zero.

Yes, I also see the leak, thanks to this warning message. Those will get fixed in v3.
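
For reference, the ref counting described above can be modeled in a few lines
of userspace C. This is only a sketch under stated assumptions, not the driver
code: struct child_model, child_new(), child_put(), present_work_model() and
unload_model() are hypothetical names, and the starting count of two is taken
from the observation quoted above. It shows that the object is freed only if
the work that does the two puts actually runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-device struct; not the real hpdev. */
struct child_model {
	int refs;     /* outstanding references */
	bool *freed;  /* set to true when the last reference is dropped */
};

struct child_model *child_new(int initial_refs, bool *freed_flag)
{
	struct child_model *c = malloc(sizeof(*c));

	if (!c)
		abort();
	c->refs = initial_refs;
	c->freed = freed_flag;
	return c;
}

/* Drop one reference; free the object when the count reaches zero. */
void child_put(struct child_model *c)
{
	assert(c->refs > 0);
	if (--c->refs == 0) {
		*c->freed = true;
		free(c);
	}
}

/* Models the two puts the thread attributes to pci_devices_present_work(). */
void present_work_model(struct child_model *c)
{
	child_put(c);
	child_put(c);
}

/*
 * Models driver unload. Returns true if the object was freed through
 * ref counting, false if it would have been leaked.
 */
bool unload_model(bool schedule_work)
{
	bool freed = false;
	struct child_model *c = child_new(2, &freed); /* assumed count of two */

	if (schedule_work)
		present_work_model(c);

	/* Cleanup for this demo only; the model has no such fallback. */
	if (!freed)
		free(c);
	return freed;
}
```

In this model, unload_model(true) corresponds to the behavior before the patch
(the work runs and the count drops to zero), and unload_model(false)
corresponds to the behavior with the patch (the work is not scheduled, so the
count stays at two).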