Re: [PATCH v2 1/4] efi/x86: Remove EfiMemoryMappedIO from E820 map

From: Hans de Goede
Date: Fri Dec 09 2022 - 06:05:58 EST


Hi,

On 12/9/22 09:06, Hans de Goede wrote:
> Hi,
>
> One comment (logging bug in patch) below:
>
> On 12/8/22 20:03, Bjorn Helgaas wrote:
>> From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>
>> Firmware can use EfiMemoryMappedIO to request that MMIO regions be mapped
>> by the OS so they can be accessed by EFI runtime services, but should have
>> no other significance to the OS (UEFI r2.10, sec 7.2). However, most
>> bootloaders and EFI stubs convert EfiMemoryMappedIO regions to
>> E820_TYPE_RESERVED entries, which prevent Linux from allocating space from
>> them (see remove_e820_regions()).
>>
>> Some platforms use EfiMemoryMappedIO entries for PCI MMCONFIG space and PCI
>> host bridge windows, which means Linux can't allocate BAR space for
>> hot-added devices.
>>
>> Remove large EfiMemoryMappedIO regions from the E820 map to avoid this
>> problem.
>>
>> Leave small (< 256KB) EfiMemoryMappedIO regions alone because on some
>> platforms, these describe non-window space that's included in host bridge
>> _CRS. If we assign that space to PCI devices, they don't work. On the
>> Lenovo X1 Carbon, this leads to suspend/resume failures.
>>
>> The previous solution to the problem of allocating BARs in these regions
>> was to add pci_crs_quirks[] entries to disable E820 checking for these
>> machines (see d341838d776a ("x86/PCI: Disable E820 reserved region clipping
>> via quirks")):
>>
>> Acer DMI_PRODUCT_NAME Spin SP513-54N
>> Clevo DMI_BOARD_NAME X170KM-G
>> Lenovo DMI_PRODUCT_VERSION *IIL*
>>
>> Florent reported the BAR allocation issue on the Clevo NL4XLU. We could
>> add another quirk for the NL4XLU, but I hope this generic change can solve
>> it for many machines without having to add quirks.
>>
>> This change has been tested on Clevo X170KM-G (Konrad) and Lenovo Ideapad
>> Slim 3 (Matt) and solves the problem even when overriding the existing
>> quirks by booting with "pci=use_e820".
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216565 Clevo NL4XLU
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206459#c78 Clevo X170KM-G
>> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1868899 Ideapad Slim 3
>> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2029207 X1 Carbon
>> Reported-by: Florent DELAHAYE <kernelorg@xxxxxxxxx>
>> Tested-by: Konrad J Hambrick <kjhambrick@xxxxxxxxx>
>> Tested-by: Matt Hansen <2lprbe78@xxxxxxxx>
>> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>> Cc: Hans de Goede <hdegoede@xxxxxxxxxx>
>> ---
>> arch/x86/platform/efi/efi.c | 46 +++++++++++++++++++++++++++++++++++++
>> 1 file changed, 46 insertions(+)
>>
>> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
>> index ebc98a68c400..dee1852e95cd 100644
>> --- a/arch/x86/platform/efi/efi.c
>> +++ b/arch/x86/platform/efi/efi.c
>> @@ -303,6 +303,50 @@ static void __init efi_clean_memmap(void)
>> }
>> }
>>
>> +/*
>> + * Firmware can use EfiMemoryMappedIO to request that MMIO regions be
>> + * mapped by the OS so they can be accessed by EFI runtime services, but
>> + * should have no other significance to the OS (UEFI r2.10, sec 7.2).
>> + * However, most bootloaders and EFI stubs convert EfiMemoryMappedIO
>> + * regions to E820_TYPE_RESERVED entries, which prevent Linux from
>> + * allocating space from them (see remove_e820_regions()).
>> + *
>> + * Some platforms use EfiMemoryMappedIO entries for PCI MMCONFIG space and
>> + * PCI host bridge windows, which means Linux can't allocate BAR space for
>> + * hot-added devices.
>> + *
>> + * Remove large EfiMemoryMappedIO regions from the E820 map to avoid this
>> + * problem.
>> + *
>> + * Retain small EfiMemoryMappedIO regions because on some platforms, these
>> + * describe non-window space that's included in host bridge _CRS. If we
>> + * assign that space to PCI devices, they don't work.
>> + */
>> +static void __init efi_remove_e820_mmio(void)
>> +{
>> + efi_memory_desc_t *md;
>> + u64 size, start, end;
>> + int i = 0;
>> +
>> + for_each_efi_memory_desc(md) {
>> + if (md->type == EFI_MEMORY_MAPPED_IO) {
>> + size = md->num_pages << EFI_PAGE_SHIFT;
>> + if (size >= 256*1024) {
>> + start = md->phys_addr;
>> + end = start + size - 1;
>> + pr_info("Remove mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluMB) from e820 map\n",
>> + i, start, end, size >> 20);
>> + e820__range_remove(start, size,
>> + E820_TYPE_RESERVED, 1);
>> + } else {
>> + pr_info("Not removing mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluKB) from e820 map\n",
>> + i, start, end, size >> 10);
>
> The logging in this else is re-using the start and end from the previous section which was actually removed.
>
> E.g. Matt's latest log from:
> https://bugzilla.redhat.com/show_bug.cgi?id=1868899
> has:
>
> [ 0.000000] e820: remove [mem 0xfc800000-0xfe7fffff] reserved
> [ 0.000000] efi: Not removing mem46: MMIO range=[0xfc800000-0xfe7fffff] (4KB) from e820 map
> [ 0.000000] efi: Not removing mem47: MMIO range=[0xfc800000-0xfe7fffff] (32KB) from e820 map
> [ 0.000000] efi: Not removing mem49: MMIO range=[0xfc800000-0xfe7fffff] (8KB) from e820 map
> [ 0.000000] efi: Not removing mem50: MMIO range=[0xfc800000-0xfe7fffff] (4KB) from e820 map
>
> Notice how all the "Not removing ..." lines log the same range as
> the actually removed map entry above them.

I realize the fix is very obvious, but since I just fixed this in my
local tree anyways, here is my fix for this:

--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -331,9 +331,9 @@ static void __init efi_remove_e820_mmio(void)
for_each_efi_memory_desc(md) {
if (md->type == EFI_MEMORY_MAPPED_IO) {
size = md->num_pages << EFI_PAGE_SHIFT;
+ start = md->phys_addr;
+ end = start + size - 1;
if (size >= 256*1024) {
- start = md->phys_addr;
- end = start + size - 1;
pr_info("Remove mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluMB) from e820 map\n",
i, start, end, size >> 20);
e820__range_remove(start, size,

Regards,

Hans






>> + }
>> + }
>> + i++;
>> + }
>> +}
>> +
>> void __init efi_print_memmap(void)
>> {
>> efi_memory_desc_t *md;
>> @@ -474,6 +518,8 @@ void __init efi_init(void)
>> set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
>> efi_clean_memmap();
>>
>> + efi_remove_e820_mmio();
>> +
>> if (efi_enabled(EFI_DBG))
>> efi_print_memmap();
>> }
>