Re: [PATCH] PCIe AER: honor ACPI HEST FIRMWARE FIRST mode

From: Hidetoshi Seto
Date: Thu Oct 29 2009 - 23:24:48 EST


Matt Domsch wrote:
>> Matt Domsch wrote:
>>> + /* These three should never appear */
>>> + case ACPI_HEST_TYPE_NOT_USED3:
>>> + case ACPI_HEST_TYPE_NOT_USED4:
>>> + case ACPI_HEST_TYPE_NOT_USED5:
>>> + break;
>> Yes, these should never. But if there, what will happen?
>> I'd like to see a error message rather than hang in infinite loops.
>>
>>> + /* These should never appear either */
>>> + case ACPI_HEST_TYPE_RESERVED:
>>> + default:
>>> + break;
>> Ditto.
>
> It won't infinite loop, due to i incrementing. If one of these types
> appears early in the error_source list, it would prevent us from
> seeing the (correct) source types later in the list.

Ah, you are right. Thanks for correcting me.

> But we can't advance the pointer because we can't know by how much to
> advance it. We could print a debug message. How about:
>
> printk(KERN_DEBUG PREFIX
> "HEST Error Source list contains an obsolete type.\n");

Good.
And it would be informative to print the type number.
... "HEST Error Source list contains an obsolete type (%d).\n", hdr->type);

> At some point, the RESERVED type will move to higher number as new
> sources are defined in the spec, so that would be different message.
> How about:
>
> printk(KERN_DEBUG PREFIX
> "HEST Error Source list contains a reserved type.\n");

Of course good.
And print the type number too, please.

> and I'll have it print only once.

Nice!

>> One another concern I got here is if there was a seg_n, segment
>> number, how we can handle it... but it will be one of future works,
>> so this would be OK at this time.
>
> Yes, but this ACPI table doesn't have a domain / segment number in
> it. Does this test:
>
> if (p->flags & ACPI_HEST_FIRMWARE_FIRST &&
> (p->flags & ACPI_HEST_GLOBAL ||
> (p->bus == pci->bus->number &&
> p->device == PCI_SLOT(pci->devfn) &&
> p->function == PCI_FUNC(pci->devfn))))
> *firmware_first = 1;
>
> need to add an explicit test for && 0 == pci_domain_nr(pci->bus) ?

Maybe it would be better to have, to find problem early if it happens.

>>> @@ -35,6 +36,9 @@ int pci_enable_pcie_error_reporting(struct pci_dev *dev)
>>> u16 reg16 = 0;
>>> int pos;
>>>
>>> + if (dev->aer_firmware_first)
>>> + return -EIO;
>>> +
>> The aer_init() will be called for root ports (via aer_probe() of
>> aer service driver), but not for end point devices or so on.
>> So you need to call aer_init() for end points from here or somewhere
>> else, to set aer_firmware_first correctly.
>
> Good catch. I'll move the setting of dev->aer_firmware_first into the
> PCI device discovery path. By that point, ACPI is configured and
> available.

Please be careful that there could be hot-plugged devices later.

>>> + if (acpi_hest_firmware_first_pci(dev->port)) {
>>> + dev_printk(KERN_DEBUG, &dev->device,
>>> + "PCIe device errors handled by platform firmware.\n");
>>> + dev->port->aer_firmware_first=1;
>> Coding style requests you to add spaces before and after of "=".
>
> OK. This and the next will move as noted above.

Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/