Re: Why set .suppress_bind_attrs even though .remove() implemented?

From: Marc Zyngier
Date: Mon Jul 25 2022 - 13:35:39 EST


On Mon, 25 Jul 2022 16:18:48 +0100,
Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> On Mon, Jul 25, 2022 at 03:43:40PM +0100, Marc Zyngier wrote:
> > On Mon, 25 Jul 2022 14:25:49 +0100,
> > Johan Hovold <johan@xxxxxxxxxx> wrote:
> > >
> > > [ +CC: maz ]
> > >
> > > On Fri, Jul 22, 2022 at 09:38:58AM -0500, Bjorn Helgaas wrote:
> > > > On Fri, Jul 22, 2022 at 03:26:44PM +0200, Johan Hovold wrote:
> > > > > On Thu, Jul 21, 2022 at 05:21:22PM -0500, Bjorn Helgaas wrote:
> > > >
> > > > > > qcom is a DWC driver, so all the IRQ stuff happens in
> > > > > > dw_pcie_host_init(). qcom_pcie_remove() does call
> > > > > > dw_pcie_host_deinit(), which calls irq_domain_remove(), but nobody
> > > > > > calls irq_dispose_mapping().
> > > > > >
> > > > > > I'm thoroughly confused by all this. But I suspect that maybe I
> > > > > > should drop the "make qcom modular" patch because it seems susceptible
> > > > > > to this problem:
> > > > > >
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/ctrl/qcom&id=41b68c2d097e
> > > > >
> > > > > That should not be necessary.
> > > > >
> > > > > As you note above, interrupt handling is implemented in dwc core so if
> > > > > there are any issues here at all, which I doubt, then all of the dwc
> > > > > drivers that currently can be built as modules would all be broken and
> > > > > this would need to be fixed in core.
> > > >
> > > > I don't know yet whether there's an issue. We need a clear argument
> > > > for why there is or is not. The fact that others might be broken is
> > > > not an argument for breaking another one ;)
> > >
> > > It's not breaking anything that is currently working, and if there's
> > > some corner case during module unload, that's not the end of the world
> > > either.
> >
> > It may not be the end of the world for you, but you have absolutely no
> > idea of what dangling pointers to kernel memory will do on a user's
> > machine, nor how this can be further exploited. Unloading a module
> > should never result in an unsafe kernel.
>
> Since when is unloading modules something that is expected to work
> perfectly? I keep hearing "well, don't do that then" when someone
> complains about unloading this module while doing this or that broke
> something. (And it's only root that can unload modules in the first
> place.)

Well, maybe I have higher standards. For the stuff I maintain, I now
point-blank refuse to support module unloading if this can result in a
crash. Or worse.

> If this was the general understanding, then it seems the only option
> would be to disable module unloading completely as module remove code
> almost by definition gets less testing and is subject to bit rot.

My personal preference would be to prevent module unloading by default
once probing has succeeded, and have modules explicitly opt in to
unloading. But that ship sailed a long time ago.

> It's useful for developers, but use it at your own risk.
>
> That said, I agree that if something is next to impossible to get right,
> as may be the case with interrupt controllers generally, then fine,
> let's disable module unloading for that class of drivers.
>
> And this would mean disabling driver unbind for the 20+ PCI
> drivers that currently implement it to some degree.

That would be Bjorn's and Lorenzo's call.

> Also note that we only appear to have some 60 drivers in the tree that
> can be built as modules but cannot be unloaded (if my grep patterns
> were correct).

I'm not surprised. Preventing module unload requires extra "code", and
hardly anyone cares.

>
> > > > > I've been using the modular pcie-qcom patch for months now, unloading
> > > > > and reloading the driver repeatedly to test power sequencing, without
> > > > > noticing any problems whatsoever.
> > > >
> > > > Pali's commit log suggests that unloading the module is not, by
> > > > itself, enough to trigger the problem:
> > > >
> > > > https://lore.kernel.org/linux-pci/20220709161858.15031-1-pali@xxxxxxxxxx/
> > > >
> > > > Can you test the scenario he mentions?
> > >
> > > Turns out the pcie-qcom driver does not support legacy interrupts so
> > > there's no risk of there being any lingering mappings if I understand
> > > things correctly.
> >
> > It still does MSIs, thanks to dw_pcie_host_init(). If you can remove
> > the driver while devices are up and running with MSIs allocated,
> > things may get ugly if things align the wrong way (if a driver still
> > has a reference to an irq_desc or irq_data, for example).
>
> That is precisely the way I've been testing it and everything appears
> to be torn down as it should.
>
> And a PCI driver that has been unbound should have released its
> resources, or that's a driver bug. Right?

But that's the thing: you can easily remove part of the infrastructure
without the endpoint driver even noticing. It may not happen in your
particular case if removing the RC driver will also nuke the endpoints
in the process, but I can't see this being an absolute guarantee. The
crash pointed to by an earlier email is symptomatic of it.
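
To make the hazard concrete, here is a userspace toy model. It is not
kernel code, and every name in it (toy_domain, toy_consumer, toy_map
and so on) is made up for illustration. It assumes the infrastructure
counts its users and refuses to go away while any remain, which is one
possible way to avoid a dangling pointer, not a description of what the
DWC core or the irqdomain code does.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stands in for interrupt infrastructure owned by the RC driver. */
struct toy_domain {
	int users;	/* consumers that still hold a mapping */
};

/* Stands in for an endpoint driver holding a pointer into it. */
struct toy_consumer {
	struct toy_domain *dom;
};

/* Allocate a zeroed domain; returns NULL on allocation failure. */
static struct toy_domain *toy_domain_create(void)
{
	return calloc(1, sizeof(struct toy_domain));
}

/* The consumer takes a reference; the domain records it. */
static void toy_map(struct toy_consumer *c, struct toy_domain *d)
{
	c->dom = d;
	d->users++;
}

/* The consumer drops its reference and forgets the pointer. */
static void toy_unmap(struct toy_consumer *c)
{
	c->dom->users--;
	c->dom = NULL;
}

/*
 * Refuse removal while a consumer still holds a mapping. Freeing the
 * domain at that point would leave the consumer's pointer dangling.
 */
static bool toy_domain_remove(struct toy_domain *d)
{
	if (d->users > 0)
		return false;
	free(d);
	return true;
}
```

In this model, calling toy_domain_remove() after toy_map() but before
toy_unmap() returns false and frees nothing. Without the users check,
the same call would free memory that the consumer still points at.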

> And for the OF INTx case you mentioned earlier, aren't those mapped by
> PCI core and could in theory be released by core as well?

Potentially, though I haven't tried to follow the life cycle of those.
The whole thing is pretty fragile, and this sort of resource is rarely
expected to be removed...

M.

--
Without deviation from the norm, progress is not possible.