Re: [PATCH v18 10/11] PCI/DPC: Add Error Disconnect Recover (EDR) support
From: Bjorn Helgaas
Date: Tue Mar 24 2020 - 17:37:15 EST
On Mon, Mar 23, 2020 at 05:26:07PM -0700, sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx wrote:
> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
>
> Error Disconnect Recover (EDR) is a feature that allows ACPI firmware to
> notify OSPM that a device has been disconnected due to an error condition
> (ACPI v6.3, sec 5.6.6). OSPM advertises its support for EDR on PCI devices
> via _OSC (see [1], sec 4.5.1, table 4-4). The OSPM EDR notify handler
> should invalidate software state associated with disconnected devices and
> may attempt to recover them. OSPM communicates the status of recovery to
> the firmware via _OST (sec 6.3.5.2).
>
> For PCIe, firmware may use Downstream Port Containment (DPC) to support
> EDR. Per [1], sec 4.5.1, table 4-6, even if firmware has retained control
> of DPC, OSPM may read/write DPC control and status registers during the EDR
> notification processing window, i.e., from the time it receives an EDR
> notification until it clears the DPC Trigger Status.
>
> Note that per [1], sec 4.5.1 and 4.5.2.4,
>
> 1. If the OS supports EDR, it should advertise that to firmware by
> setting OSC_PCI_EDR_SUPPORT in _OSC Support.
>
> 2. If the OS sets OSC_PCI_EXPRESS_DPC_CONTROL in _OSC Control to request
> control of the DPC capability, it must also set OSC_PCI_EDR_SUPPORT in
> _OSC Support.
>
> Add an EDR notify handler to attempt recovery.
>
> [1] Downstream Port Containment Related Enhancements ECN, Jan 28, 2019,
> affecting PCI Firmware Specification, Rev. 3.2
> https://members.pcisig.com/wg/PCI-SIG/document/12888
> Link: https://lore.kernel.org/r/9ae1d3285beeb81bbf85571a89b8f3d4451eae8f.1583286655.git.sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/r/246aa05acca8f0a7e6d20a65ab05af0027f60118.1583286655.git.sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx
> [bhelgaas: squash add/enable patches into one]
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Cc: Len Brown <lenb@xxxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> +static int acpi_enable_dpc(struct pci_dev *pdev)
> +{
> + struct acpi_device *adev = ACPI_COMPANION(&pdev->dev);
> + union acpi_object *obj, argv4, req;
> + int status;
> +
> + /*
> + * Some firmware implementations will return default values for
> + * unsupported _DSM calls. So checking acpi_evaluate_dsm() return
> + * value for NULL condition is not a complete method for finding
> + * whether given _DSM function is supported or not. So use
> + * explicit func 0 call to find whether given _DSM function is
> + * supported or not.
> + */
> + status = acpi_check_dsm(adev->handle, &pci_acpi_dsm_guid, 5,
> + 1ULL << EDR_PORT_DPC_ENABLE_DSM);

This is really ugly.  What's the story on this firmware?  It sounds
defective to me.

Or is everybody that uses _DSM supposed to check before evaluating it?
E.g.,

  if (!acpi_check_dsm(...))
    return -EINVAL;

  obj = acpi_evaluate_dsm(...);

If everybody is supposed to do this, it seems like the check part
should be moved into acpi_evaluate_dsm().
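
To make the idea concrete, here is a rough userspace sketch of folding
the function-0 check into the evaluate path.  Everything named mock_*
is hypothetical and only stands in for the ACPI helpers; it is not the
kernel API and not a proposed implementation, just an illustration of
the shape under the assumption that firmware hands back a default
value for functions it does not implement.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a firmware _DSM method; not a kernel API. */
struct mock_dsm {
	uint64_t supported;	/* bitmask that function 0 would report */
	int default_value;	/* what sloppy firmware returns for any function */
};

/* Mimics a function-0 query: true only if every bit in funcs is advertised. */
static bool mock_check_dsm(const struct mock_dsm *fw, uint64_t funcs)
{
	return funcs != 0 && (fw->supported & funcs) == funcs;
}

/*
 * Mimics raw evaluation on firmware that returns a default value even
 * for unsupported functions, so a NULL check alone proves nothing.
 */
static const int *mock_evaluate_dsm_raw(const struct mock_dsm *fw,
					unsigned int func)
{
	(void)func;
	return &fw->default_value;
}

/* Checked wrapper: NULL unless function 0 advertises func. */
static const int *mock_evaluate_dsm(const struct mock_dsm *fw,
				    unsigned int func)
{
	if (func >= 64 || !mock_check_dsm(fw, 1ULL << func))
		return NULL;

	return mock_evaluate_dsm_raw(fw, func);
}
```

With the check inside the wrapper, a caller could go back to testing
the result for NULL and would not need to repeat the func 0 dance at
every call site.
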
> + if (!status)
> + return 0;
> +
> + status = 0;
> + req.type = ACPI_TYPE_INTEGER;
> + req.integer.value = 1;
> +
> + argv4.type = ACPI_TYPE_PACKAGE;
> + argv4.package.count = 1;
> + argv4.package.elements = &req;
> +
> + /*
> + * Per Downstream Port Containment Related Enhancements ECN to PCI
> + * Firmware Specification r3.2, sec 4.6.12, EDR_PORT_DPC_ENABLE_DSM is
> + * optional. Return success if it's not implemented.
> + */
> + obj = acpi_evaluate_dsm(adev->handle, &pci_acpi_dsm_guid, 5,
> + EDR_PORT_DPC_ENABLE_DSM, &argv4);
> + if (!obj)
> + return 0;