Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is disconnected

From: Alex_Gagniuc
Date: Wed Nov 14 2018 - 15:52:24 EST


On 11/14/2018 02:27 PM, Keith Busch wrote:
> On Wed, Nov 14, 2018 at 07:22:04PM +0000, Alex_Gagniuc@xxxxxxxxxxxx wrote:
>> On 11/14/2018 12:00 AM, Bjorn Helgaas wrote:
>>> Just to make sure we're on the same page, can you point me to this
>>> rule? I do see that OSPM must request control of AER using _OSC
>>> before it touches the AER registers. What I don't see is the
>>> connection between firmware-first and the AER registers.
>>
>> ACPI 6.2 - 6.2.11.3, Table 6-197:
>>[...]
>> Maybe Keith knows better why we're doing it this way. From ACPI text, it
>> doesn't seem that control of AER would be tied to HEST entries, although
>> in practice, it is.
>
> I'm not sure, that predates me. HEST does have a FIRMWARE_FIRST flag, but
> spec does not say anymore on relation to _OSC control or AER capability.
> Nothing in PCIe spec either.

Speaking to one of the PCIe (and _HPX type 3) spec authors, ownership of
AER should be determined by _OSC, period. The result of _OSC applies to
every device under the root port. This crap we do with checking HEST is
crap.

If I'm not stepping on anyone's toes, and there are no known unintended
consequences, I can look at patching this up. I'm not promising a patch,
but it's exactly the sort of thing I like to fix.

> I also don't know why Linux disables the AER driver if only one
> device has a FIRMWARE_FIRST HEST. Shouldn't that just be a per-device
> decision?

I think the logic is if one HEST entry has both FFS and GLOBAL flags
set, then disable AER services for all devices. It works in
practice better than it works in theory. I think _OSC should be the
determining factor here, not HEST.
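
To make the contrast concrete, here is a rough, self-contained sketch of
the two policies. It is not the kernel code: the struct, the flag names
and both helpers are made up for illustration, and the bit positions are
only modeled on the HEST flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, modeled on the HEST AER structure flags. */
#define SKETCH_HEST_FIRMWARE_FIRST	(1u << 0)
#define SKETCH_HEST_GLOBAL		(1u << 1)

/* Hypothetical stand-in for one HEST AER entry. */
struct sketch_hest_entry {
	unsigned int flags;
};

/*
 * Policy as described above: a single entry with both FFS and GLOBAL
 * set turns AER services off for every device.
 */
static bool sketch_hest_disables_aer(const struct sketch_hest_entry *entries,
				     size_t count)
{
	const unsigned int both = SKETCH_HEST_FIRMWARE_FIRST | SKETCH_HEST_GLOBAL;

	assert(entries != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if ((entries[i].flags & both) == both)
			return true;
	}
	return false;
}

/*
 * Policy argued for above: the _OSC result alone decides whether the
 * OS owns AER, and HEST is not consulted at all.
 */
static bool sketch_aer_native_by_osc(bool osc_granted_aer)
{
	return osc_granted_aer;
}
```

With the first helper, an entry that has only FFS or only GLOBAL set
changes nothing; it takes both flags on the same entry to disable AER
for everyone.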

>>> The closest I can find is the "Enabled" field in the HEST PCIe
>>> AER structures (ACPI v6.2, sec 18.3.2.4, .5, .6), where it says:
>>> [...]
>>> AFAICT, Linux completely ignores the Enabled field in these
>>> structures.
>>
>> I don't think ignoring the field is a problem:
>> * With FFS, OS should ignore it.
>> * Without FFS, we have control, and we get to make the decisions anyway.
>> In the latter case we decide whether to use AER, independent of the crap
>> in ACPI. I'm not even sure why "Enabled" matters in native AER handling.
>> Probably one of the check-boxes in "Binary table designer's handbook"?
>
> And why doesn't Linux do anything with _OSC response other than logging
> it? If OS control wasn't granted, shouldn't that take priority over HEST?

But it does in portdrv_core.c:

	if (dev->aer_cap && pci_aer_available() &&
	    (pcie_ports_native || host->native_aer)) {
		services |= PCIE_PORT_SERVICE_AER;

That flag later creates a pcie device that aerdrv can attach to.
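
For anyone who wants to play with the condition outside the kernel, a
minimal userspace model of that check could look like the sketch below.
The struct, the helper and the bit value are all hypothetical; the four
booleans just stand in for the four terms of the if statement, on the
assumption that host->native_aer reflects the _OSC outcome.

```c
#include <stdbool.h>

/* Illustrative bit value only, not taken from the kernel headers. */
#define SKETCH_PORT_SERVICE_AER	(1u << 1)

/* Hypothetical stand-ins for the terms of the portdrv_core.c check. */
struct sketch_port {
	bool has_aer_cap;	/* dev->aer_cap is non-zero */
	bool aer_available;	/* pci_aer_available() */
	bool ports_native;	/* pcie_ports_native */
	bool host_native_aer;	/* host->native_aer */
};

/* Returns the service mask; the AER bit is set only if the check passes. */
static unsigned int sketch_port_services(const struct sketch_port *p)
{
	unsigned int services = 0;

	if (p->has_aer_cap && p->aer_available &&
	    (p->ports_native || p->host_native_aer))
		services |= SKETCH_PORT_SERVICE_AER;

	return services;
}
```

In this model, a port with the AER capability but with both
ports_native and host_native_aer false ends up with an empty mask, so
nothing would be created for aerdrv to bind to.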

Alex