Re: [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks
From: Eric W. Biederman
Date: Mon Nov 16 2020 - 20:07:28 EST
Bjorn Helgaas <helgaas@xxxxxxxxxx> writes:
> I don't think passing the device information to the kdump kernel is
> really practical. The kdump kernel would use it to do PCI config
> writes to disable MSIs before enabling IRQs, and it doesn't know how
> to access config space that early.
I don't think it is particularly practical either. But in practice
on x86 it is either mmio writes or 0xcf8 style writes and we could
pass a magic table that would have all of that information.
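To make that concrete, here is a rough sketch of what one entry in such a
table might boil down to. The struct, enum and helper names are made up for
illustration and are not existing kernel interfaces. The bit layouts are the
standard ones: the 0xcf8 address encoding, the ECAM offset encoding, and the
MSI Enable bit in Message Control. The ECAM helper assumes the base address
corresponds to bus 0 of the segment.

```c
#include <stdint.h>

/* Hypothetical: how the kdump kernel would reach one function's config space. */
enum early_cfg_method { EARLY_CFG_PORT_CF8, EARLY_CFG_MMIO_ECAM };

/* Hypothetical table entry handed over from the crashing kernel. */
struct early_cfg_entry {
	enum early_cfg_method method;
	uint64_t ecam_base;	/* used only for EARLY_CFG_MMIO_ECAM */
	uint8_t bus, dev, fn;
	uint8_t msi_cap;	/* config offset of the MSI capability */
};

/* Value to write to port 0xcf8 to select a dword-aligned register. */
static uint32_t conf1_address(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
	return 0x80000000u |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)(dev & 0x1f) << 11) |
	       ((uint32_t)(fn & 0x07) << 8) |
	       (uint32_t)(reg & 0xfc);
}

/* Physical address of a register for mmio (ECAM) config access. */
static uint64_t ecam_address(uint64_t base, uint8_t bus, uint8_t dev,
			     uint8_t fn, uint16_t reg)
{
	return base +
	       ((uint64_t)bus << 20) +
	       ((uint64_t)(dev & 0x1f) << 15) +
	       ((uint64_t)(fn & 0x07) << 12) +
	       (uint64_t)(reg & 0xfff);
}

/* Message Control lives at msi_cap + 2; bit 0 is MSI Enable. */
static uint16_t msi_ctrl_disable(uint16_t msi_ctrl)
{
	return (uint16_t)(msi_ctrl & ~0x0001u);
}
```

The sketch only computes addresses and register values. The actual port or
mmio accesses, and everything needed to know that those accesses are safe
that early, are the hard part and are deliberately left out.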
> We could invent special "early config access" things, but that gets
> really complicated really fast. Config access depends on ACPI MCFG
> tables, firmware interfaces, and in many cases, on the native host
> bridge drivers in drivers/pci/controllers/.
I do agree that the practical problem with passing information early
is that it gets us into the weeds and creates code that we only care
about in the case of kexec-on-panic. It is much better to make the
existing code more robust, so that we reduce our dependency on firmware
doing the right thing.
> I think we need to disable MSIs in the crashing kernel before the
> kexec. It adds a little more code in the crash_kexec() path, but it
> seems like a worthwhile tradeoff.
Disabling MSIs in the b0rken kernel is not possible.
Walking the device tree or even a significant subset of it hugely
increases the chances that we will run into something that is incorrect
in the known broken kernel. I expect the code to do that would double
or triple the amount of code that must be executed in the known broken
kernel. The last time something like that happened (switching from xchg
to ordinary locks) we had cases that stopped working. Walking all of
the pci devices in the system is much more invasive.
That is not to downplay the problems of figuring out how to disable
things in early boot.
My two top candidates are poking the IOMMUs early to shut things off,
and figuring out if we can delay enabling interrupts until we have
initialized pci.
Poking at IOMMUs early should work for most systems with ``enterprise''
hardware, which are the systems where people care about kdump the most.
Eric