Re: [PATCH RESEND] PCI: s390: Fix use-after-free of PCI bus resources with s390 per-function hotplug

From: Bjorn Helgaas
Date: Wed Mar 08 2023 - 13:38:17 EST


On Tue, Feb 28, 2023 at 10:08:45AM +0100, Niklas Schnelle wrote:
> On Fri, 2023-02-24 at 05:19 +0100, Lukas Wunner wrote:
> > On Thu, Feb 23, 2023 at 01:53:45PM -0600, Bjorn Helgaas wrote:
> > > Hmm. Good question. Off the top of my head, I can't explain the
> > > difference between pci_rescan_remove_lock and pci_bus_sem, so I'm
> > > confused, too. I added Lukas in case he has a ready explanation.
> >
> > pci_bus_sem is a global lock which protects the "devices" list of all
> > pci_bus structs.
> >
> > We do have a bunch of places left where the "devices" list is accessed
> > without holding pci_bus_sem, though I've tried to slowly eliminate
> > them.
> >
> > pci_rescan_remove_lock is a global "big kernel lock" which serializes
> > any device addition and removal.
> >
> > pci_rescan_remove_lock is known to be far too coarse-grained and thus
> > deadlock-prone, particularly if hotplug ports are nested (as is the
> > case with Thunderbolt). It needs to be split up into several smaller
> > locks which protect e.g. allocation of resources of a bus (bus numbers
> > or MMIO / IO space) and whatever else needs to be protected. It's just
> > that nobody has gotten around to identifying what exactly needs to be
> > protected, adding the new locks and removing pci_rescan_remove_lock.
>
> Thanks for the insights. So from that description I think it might make
> sense to do this fix patch with the pci_rescan_remove_lock so it can be
> backported. Then we can take the opportunity to add a lock specific to
> the allocation/freeing of resources which would then replace at least
> this new, directly and clearly resource-related use of
> pci_rescan_remove_lock and potentially others we find.
> What do you think?

I don't think Lukas was suggesting that *you* need to split the
locking up, just that it *should* be split up someday. To me, that
looks like a project on its own that is beyond the scope of this
particular fix, so I think the pci_lock_rescan_remove() as you have it
here is fine for now.
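
For illustration only, the coarse-grained pattern discussed above can be
sketched in userspace C. This is not the patch itself: the pthread mutex
stands in for the kernel's pci_rescan_remove_lock, and struct bus_res,
func_add() and func_remove() are hypothetical names made up for the
sketch. It assumes a bus resource shared by several hotplugged functions,
allocated on first add and freed on last remove, with one global lock
serializing both paths so a remove cannot free the resource while an add
is still looking at it.

```c
/* Userspace sketch only: pthread stand-ins for the kernel's
 * pci_lock_rescan_remove() / pci_unlock_rescan_remove(). */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One global lock serializing every add and remove (assumed analogue). */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a bus resource shared by several functions. */
struct bus_res {
	int users;   /* functions currently using the resource */
	int *window; /* stand-in for the allocation, NULL when freed */
};

static void lock_rescan_remove(void)
{
	pthread_mutex_lock(&rescan_remove_lock);
}

static void unlock_rescan_remove(void)
{
	pthread_mutex_unlock(&rescan_remove_lock);
}

/* Function hot-added: allocate on first use. Returns 0 on success. */
static int func_add(struct bus_res *res)
{
	int ret = 0;

	lock_rescan_remove();
	if (res->users == 0) {
		res->window = calloc(1, sizeof(*res->window));
		if (!res->window)
			ret = -1;
	}
	if (!ret)
		res->users++;
	unlock_rescan_remove();
	return ret;
}

/* Function hot-removed: free when the last user goes away. */
static void func_remove(struct bus_res *res)
{
	lock_rescan_remove();
	assert(res->users > 0);
	if (--res->users == 0) {
		free(res->window);
		res->window = NULL; /* no dangling pointer left behind */
	}
	unlock_rescan_remove();
}
```

A finer-grained scheme would replace the single global mutex with a lock
dedicated to resource allocation and freeing, which is the separate
project mentioned above.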

When you fix up the superfluous "return", go ahead and add my

Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

to your patch. I assume it's easier for you to shepherd this through
the s390 tree, but let me know if you'd rather that I take it.

Bjorn