RE: [PATCH 1/1] cxl/acpi.c: Add buggy BIOS hint for CXL ACPI lookup failure

From: Dan Williams
Date: Mon Apr 08 2024 - 12:55:09 EST


ppwaskie@ wrote:
> From: PJ Waskiewicz <ppwaskie@xxxxxxxxxx>
>
> Currently, Type 3 CXL devices (CXL.mem) can train using host CXL
> drivers on Emerald Rapids systems. However, on some production
> systems from some vendors, a buggy BIOS exists that improperly
> populates the ACPI => PCI mappings. This leads to the cxl_acpi
> driver to fail probe when it cannot find the root port's _UID, in
> order to look up the device's CXL attributes in the CEDT.
>
> Add a bit more of a descriptive message that the lookup failure
> could be a bad BIOS, rather than just "failed."

Makes sense, but is the goal here to name and shame the BIOS, or find a
potential quirk workaround? Presumably we could fall back to parsing
_UID instead of a string and then get some guidance from said BIOS about
how to look up the corresponding ACPI0016 device from that identifier.
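
As a rough illustration of that fallback: the ACPI spec allows _UID to
evaluate to either an integer or a string, and acpi_evaluate_integer()
only succeeds for the integer case. The sketch below is plain userspace
C, not cxl_acpi code. The uid_obj type and uid_to_u64() helper are
hypothetical stand-ins, and whether the affected BIOS actually returns a
string _UID is an assumption until an acpidump confirms it.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an evaluated _UID object; not a kernel type. */
enum uid_kind { UID_INTEGER, UID_STRING };

struct uid_obj {
	enum uid_kind kind;
	uint64_t integer;	/* valid when kind == UID_INTEGER */
	const char *string;	/* valid when kind == UID_STRING */
};

/*
 * Return 0 and store the numeric _UID in *uid, or -EINVAL when a string
 * _UID is not a plain number (e.g. "CXL0"), in which case the platform
 * vendor would have to say how to map it to the ACPI0016 device.
 */
static int uid_to_u64(const struct uid_obj *obj, uint64_t *uid)
{
	unsigned long long val;
	char *end;

	if (obj->kind == UID_INTEGER) {
		*uid = obj->integer;
		return 0;
	}

	/* Reject NULL, empty, signed, or whitespace-prefixed strings. */
	if (!obj->string || !isdigit((unsigned char)obj->string[0]))
		return -EINVAL;

	errno = 0;
	val = strtoull(obj->string, &end, 0);	/* base 0: accepts 0x.. too */
	if (errno || *end != '\0')
		return -EINVAL;

	*uid = val;
	return 0;
}
```

A string such as "CXL0" is rejected rather than guessed at, which is the
point where guidance from the BIOS vendor would be needed.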

In other words, I see this patch as a warning shot of, "hey,
$platform_vendor if you don't want folks to RMA these platforms please
tell us how to do the association Linux expects per the spec".
Otherwise, this can escalate to a loud
WARN_TAINT(TAINT_FIRMWARE_WORKAROUND...), but I first want more details
from this platform like an acpidump and the exact error code
acpi_evaluate_integer() is returning.