[PATCH v2 0/4] PCI: Continue E820 vs host bridge window saga

From: Bjorn Helgaas
Date: Thu Dec 08 2022 - 14:03:55 EST


From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>

When allocating space for PCI BARs, Linux avoids allocating space mentioned
in the E820 map. This was originally done by 4dc2287c1805 ("x86: avoid
E820 regions when allocating address space") to work around BIOS defects
that included unusable space in host bridge _CRS.
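
To illustrate the clipping described above, here is a minimal userspace
sketch. It is not the kernel's code; the struct and function names
(range, range_clip, avoid_firmware_regions) are hypothetical, and the
"keep the larger remaining piece" policy is an assumption of this model.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range; both bounds are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Shrink *avail so it no longer overlaps [start, end].  If the conflict
 * leaves free space on both sides, keep the larger side.  If nothing is
 * left, the result has start > end, which callers treat as empty.
 */
static void range_clip(struct range *avail, uint64_t start, uint64_t end)
{
	uint64_t low = 0, high = 0;

	if (avail->end < start || avail->start > end)
		return;			/* no overlap */

	if (avail->start < start)
		low = start - avail->start;	/* free bytes below */
	if (avail->end > end)
		high = avail->end - end;	/* free bytes above */

	if (low > high)
		avail->end = start - 1;
	else
		avail->start = end + 1;
}

/* Clip a candidate BAR window against every firmware-reported region. */
void avoid_firmware_regions(struct range *avail,
			    const struct range *fw, size_t n)
{
	for (size_t i = 0; i < n; i++)
		range_clip(avail, fw[i].start, fw[i].end);
}
```

In this model, a firmware region that covers an entire host bridge window
leaves nothing to allocate from, which is the failure mode the rest of
this cover letter is about.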

Some recent machines use EfiMemoryMappedIO for PCI MMCONFIG and host bridge
apertures, and bootloaders and EFI stubs convert those to E820 regions,
which means we can't allocate space for hot-added PCI devices (often a
dock) or for devices the BIOS didn't configure (often a touchpad).

The current strategy is to add DMI quirks that disable the E820 filtering
on these machines and to disable it entirely starting with 2023 BIOSes:

d341838d776a ("x86/PCI: Disable E820 reserved region clipping via quirks")
0ae084d5a674 ("x86/PCI: Disable E820 reserved region clipping starting in 2023")

But the quirks are problematic because it's really hard to list all the
machines that need them.

This series is an attempt at a more generic approach. I'm told by firmware
folks that EfiMemoryMappedIO means "the OS should map this area so EFI
runtime services can use it in virtual mode," but does not prevent the OS
from using it.

The first patch removes large EfiMemoryMappedIO areas from the E820 map.
This doesn't affect any virtual mapping of those areas (that would have to
be done directly from the EFI memory map) but it means Linux can allocate
space for PCI MMIO.
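
As a rough model of what removing a range from a typed region map
involves, the sketch below trims, splits, or deletes matching entries.
It is a simplified userspace illustration, not the kernel's
e820__range_remove(); the types, MAX_REGIONS, and range_remove() are
hypothetical, and it assumes start + size does not overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_REGIONS 16	/* arbitrary capacity for this sketch */

enum region_type { REGION_RAM = 1, REGION_RESERVED = 2 };

struct region {
	uint64_t start;		/* first byte of the region */
	uint64_t size;		/* length in bytes */
	enum region_type type;
};

struct region_map {
	size_t n;
	struct region e[MAX_REGIONS];
};

/*
 * Remove [start, start + size) from every entry of the given type.
 * Returns 0 on success, -1 if splitting an entry would overflow the map.
 */
int range_remove(struct region_map *m, uint64_t start, uint64_t size,
		 enum region_type type)
{
	uint64_t end = start + size;	/* exclusive */
	size_t i = 0;

	while (i < m->n) {
		struct region *r = &m->e[i];
		uint64_t r_end = r->start + r->size;	/* exclusive */

		if (r->type != type || r_end <= start || r->start >= end) {
			i++;		/* wrong type or no overlap */
			continue;
		}

		if (r->start >= start && r_end <= end) {
			/* Entry fully covered: delete it, keeping order. */
			memmove(r, r + 1, (m->n - i - 1) * sizeof(*r));
			m->n--;
			continue;
		}

		if (r->start < start && r_end > end) {
			/* Range lies strictly inside: split into two. */
			if (m->n == MAX_REGIONS)
				return -1;
			memmove(r + 2, r + 1, (m->n - i - 1) * sizeof(*r));
			r[1].start = end;
			r[1].size = r_end - end;
			r[1].type = r->type;
			r->size = start - r->start;
			m->n++;
			i += 2;
			continue;
		}

		if (r->start < start) {
			r->size = start - r->start;	/* trim the tail */
		} else {
			r->size = r_end - end;		/* trim the head */
			r->start = end;
		}
		i++;
	}
	return 0;
}
```

Once the reserved entry covering a host bridge aperture is gone from the
map, a clipping pass over that map no longer shrinks the window.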

The rest are basically cosmetic log message changes.

Changes from v1 to v2:
- Remove only large (>= 256KB) EfiMemoryMappedIO areas from E820 to avoid
  the Lenovo X1 Carbon suspend/resume problems. This machine includes
  64KB of non-window space in the PNP0A03 _CRS, and a corresponding
  EfiMemoryMappedIO area seems to be the only clue to avoid it (see
  https://bugzilla.redhat.com/show_bug.cgi?id=2029207). Interdiff below.
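
The size test in the item above can be sketched in isolation as
follows. This is a userspace illustration only: struct mem_desc,
MMIO_REMOVE_MIN_SIZE, and should_remove_from_e820() are hypothetical
names, while the 4KB EFI page size and the EfiMemoryMappedIO type value
(11) follow the UEFI specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT		12	/* EFI pages are 4KB */
#define EFI_MEMORY_MAPPED_IO	11	/* EfiMemoryMappedIO in the UEFI spec */
#define MMIO_REMOVE_MIN_SIZE	(256 * 1024)	/* hypothetical name for the v2 threshold */

/* Hypothetical, trimmed-down EFI memory descriptor. */
struct mem_desc {
	uint32_t type;
	uint64_t phys_addr;
	uint64_t num_pages;
};

/*
 * Return true if this descriptor's range would be dropped from the E820
 * map: only EfiMemoryMappedIO regions of at least 256KB qualify.  Smaller
 * regions stay, since they may describe non-window space in _CRS.
 */
bool should_remove_from_e820(const struct mem_desc *md)
{
	uint64_t size;

	if (md->type != EFI_MEMORY_MAPPED_IO)
		return false;

	size = md->num_pages << EFI_PAGE_SHIFT;
	return size >= MMIO_REMOVE_MIN_SIZE;
}
```

With these definitions, a 64KB region (16 pages) is retained, while a
region of exactly 256KB (64 pages) or a 256MB MMCONFIG-sized region is
removed.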


Bjorn Helgaas (4):
  efi/x86: Remove EfiMemoryMappedIO from E820 map
  PCI: Skip allocate_resource() if too little space available
  x86/PCI: Tidy E820 removal messages
  x86/PCI: Fix log message typo

 arch/x86/kernel/resource.c  |  8 +++++--
 arch/x86/pci/acpi.c         |  2 +-
 arch/x86/platform/efi/efi.c | 46 +++++++++++++++++++++++++++++++++++++
 drivers/pci/bus.c           |  4 ++++
 4 files changed, 57 insertions(+), 3 deletions(-)

--
2.25.1

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 4728f60119da..dee1852e95cd 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -315,8 +315,12 @@ static void __init efi_clean_memmap(void)
* PCI host bridge windows, which means Linux can't allocate BAR space for
* hot-added devices.
*
- * Remove any EfiMemoryMappedIO regions from the E820 map to avoid this
+ * Remove large EfiMemoryMappedIO regions from the E820 map to avoid this
* problem.
+ *
+ * Retain small EfiMemoryMappedIO regions because on some platforms, these
+ * describe non-window space that's included in host bridge _CRS. If we
+ * assign that space to PCI devices, they don't work.
*/
static void __init efi_remove_e820_mmio(void)
{
@@ -327,11 +331,17 @@ static void __init efi_remove_e820_mmio(void)
for_each_efi_memory_desc(md) {
if (md->type == EFI_MEMORY_MAPPED_IO) {
size = md->num_pages << EFI_PAGE_SHIFT;
start = md->phys_addr;
end = start + size - 1;
- pr_info("Remove mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluMB) from e820 map\n",
- i, start, end, size >> 20);
- e820__range_remove(start, size, E820_TYPE_RESERVED, 1);
+ if (size >= 256*1024) {
+ pr_info("Remove mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluMB) from e820 map\n",
+ i, start, end, size >> 20);
+ e820__range_remove(start, size,
+ E820_TYPE_RESERVED, 1);
+ } else {
+ pr_info("Not removing mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluKB) from e820 map\n",
+ i, start, end, size >> 10);
+ }
}
i++;
}