Re: [PATCH 0/5] cxl: Log downport PCIe AER and CXL RAS error information
From: Terry Bowman
Date: Fri Oct 28 2022 - 10:29:50 EST
Hi Ariel,
On 10/28/22 07:30, Ariel.Sibley@xxxxxxxxxxxxx wrote:
>> -----Original Message-----
>> From: Terry Bowman <terry.bowman@xxxxxxx>
>> Sent: Friday, October 21, 2022 3:56 PM
>> To: alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx; dave.jiang@xxxxxxxxx; ira.weiny@xxxxxxxxx;
>> bwidawsk@xxxxxxxxxx; dan.j.williams@xxxxxxxxx
>> Cc: terry.bowman@xxxxxxx; linux-cxl@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; bhelgaas@xxxxxxxxxx;
>> rafael@xxxxxxxxxx; lenb@xxxxxxxxxx; Jonathan.Cameron@xxxxxxxxxx; dave@xxxxxxxxxxxx; rrichter@xxxxxxx
>> Subject: [PATCH 0/5] cxl: Log downport PCIe AER and CXL RAS error information
>>
>> This patchset adds CXL downport PCI AER and CXL RAS logging to the CXL
>> error handling. This is necessary for communicating CXL HW issues to users.
>> The included patches find and cache pointers to the AER and CXL RAS PCIe
>> capability structures. The cached pointers are then used to display the
>> error information in a later patch. These changes follow the CXL
>> specification, Chapter 8 'Control and Status Registers'.[1]
>>
>> The first patch enables CXL1.1 RCD support through the ACPI _OSC support
>> method.
>>
>> The 2nd and 3rd patches find and map PCIe AER and CXL RAS capabilities.
>>
>> The 4th patch enables AER error reporting.
>>
>> The 5th patch adds functionality to log the PCIe AER and RAS capabilities.
>>
>> TODO: work remains to consolidate the HDM and CXL RAS register mapping
>> (patch#3). The current CXL RAS register mapping will be replaced to reuse
>> the cxl_probe_component_regs() function that Dave Jiang and Alison
>> Schofield upstreamed. Should the same be done for the AER registers
>> (patch#2)? The AER registers are not in the component register block but
>> are instead in the downport and upport (RCRB).
>
> The RCD's AER registers are not in either the component register block or
> RCRB. They are in the RCiEP config space.
>
> Per CXL 3.0 Section 12.2.1.2 RCD Upstream Port-detected Errors:
> "2. Upstream Port RCRB shall not implement the AER Extended Capability."
> ...
> "4. CXL.io Functions log the received message in their respective AER Extended
> Capability."
>
I based this comment on CXL 3.0 Section 8.2.1.1, "RCH Downstream Port RCRB":
"The RCH Downstream Port RCRB is a 4-KB memory region that contains
registers based upon the PCIe-defined registers for a root port... The
RCH Downstream Port supported PCIe capabilities and extended
capabilities are listed in Table 8-18"
Table 8-18 lists the 'Advanced Error Reporting Extended Capability' with
no exceptions.
You are right that the RCD upstream port needs to be removed from my
comment. Thank you for pointing that out. My understanding is that the
RCH downstream port does include the AER registers.
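To make the point concrete, here is a rough userspace sketch of how the AER
extended capability could be located in a 4-KB RCH downstream port RCRB. It
is not code from the patchset. It assumes the RCRB follows the usual PCIe
layout, with the extended capability list starting at offset 0x100 and AER
using capability ID 0x0001. The helper names (rcrb_read32, rcrb_write32,
rcrb_find_ext_cap, rcrb_example_find_aer) and the fake register contents are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RCRB_SIZE         4096u   /* RCH downstream port RCRB is 4 KB */
#define EXT_CAP_START     0x100u  /* assumed start of the extended capability list */
#define EXT_CAP_ID_AER    0x0001u /* PCIe AER extended capability ID */

#define EXT_CAP_ID(hdr)   ((hdr) & 0xffffu)
#define EXT_CAP_NEXT(hdr) (((hdr) >> 20) & 0xffcu)

/* Hypothetical helper: little-endian 32-bit read from an RCRB image. */
static uint32_t rcrb_read32(const uint8_t *rcrb, uint32_t off)
{
	return (uint32_t)rcrb[off] |
	       ((uint32_t)rcrb[off + 1] << 8) |
	       ((uint32_t)rcrb[off + 2] << 16) |
	       ((uint32_t)rcrb[off + 3] << 24);
}

/* Hypothetical helper: little-endian 32-bit write, used to build test data. */
static void rcrb_write32(uint8_t *rcrb, uint32_t off, uint32_t val)
{
	rcrb[off] = (uint8_t)val;
	rcrb[off + 1] = (uint8_t)(val >> 8);
	rcrb[off + 2] = (uint8_t)(val >> 16);
	rcrb[off + 3] = (uint8_t)(val >> 24);
}

/*
 * Walk the extended capability list and return the offset of the first
 * capability matching cap_id, or 0 if it is not present.
 */
static uint32_t rcrb_find_ext_cap(const uint8_t *rcrb, uint32_t cap_id)
{
	uint32_t off = EXT_CAP_START;
	/* Each capability is at least 8 bytes, which bounds the walk. */
	int ttl = (RCRB_SIZE - EXT_CAP_START) / 8;

	while (ttl-- > 0 && off >= EXT_CAP_START && off <= RCRB_SIZE - 4) {
		uint32_t hdr = rcrb_read32(rcrb, off);

		/* An all-zeros or all-ones header means there is no list here. */
		if (hdr == 0 || hdr == 0xffffffffu)
			return 0;
		if (EXT_CAP_ID(hdr) == cap_id)
			return off;
		off = EXT_CAP_NEXT(hdr);
	}
	return 0;
}

/* Build a made-up RCRB image with two capabilities and look up AER. */
static uint32_t rcrb_example_find_aer(void)
{
	static uint8_t rcrb[RCRB_SIZE];

	/* 0x100: capability ID 0x0019, version 1, next pointer 0x150 */
	rcrb_write32(rcrb, 0x100, 0x0019u | (1u << 16) | (0x150u << 20));
	/* 0x150: AER (ID 0x0001), version 2, end of list */
	rcrb_write32(rcrb, 0x150, EXT_CAP_ID_AER | (2u << 16));

	return rcrb_find_ext_cap(rcrb, EXT_CAP_ID_AER);
}
```

In the kernel the reads would go through an ioremap()'d RCRB using readl()
rather than a byte array. The array is only there so the walk can be tried
in isolation.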
Regards,
Terry
>>
>> TODO: work remains to add support for upports in some of the places
>> where only the downport is addressed. For instance, will another aer_map
>> be needed to support upport AER?
>>
>> TODO: add support for CXL2.0. This should be trivial since aer_cap and
>> aer_stats are members of 'struct pci_dev'.
>>
>> Base is from: https://patchwork.kernel.org/project/cxl/list/?series=686272
>>
>> [1] - https://www.computeexpresslink.org/spec-landing
>>
>> Terry Bowman (5):
>> cxl/acpi: Set ACPI's CXL _OSC to indicate CXL1.1 support
>> cxl/pci: Discover and cache pointer to RCD dport's PCIe AER capability
>> cxl/pci: Discover and cache pointer to RCD dport's CXL RAS registers
>> cxl/pci: Enable RCD dport AER reporting
>> cxl/pci: Log CXL device's PCIe AER and CXL RAS error information
>>
>> drivers/acpi/pci_root.c | 1 +
>> drivers/cxl/acpi.c | 56 +++++++
>> drivers/cxl/core/regs.c | 1 +
>> drivers/cxl/cxl.h | 13 ++
>> drivers/cxl/cxlmem.h | 3 +
>> drivers/cxl/mem.c | 2 +
>> drivers/cxl/pci.c | 319 ++++++++++++++++++++++++++++++++++++++++
>> drivers/pci/pcie/aer.c | 45 +++++-
>> include/linux/pci.h | 4 +
>> 9 files changed, 443 insertions(+), 1 deletion(-)
>>
>> --
>> 2.34.1
>