Re: [PATCH v6] x86/PCI: Ignore E820 reservations for bridge windows on newer systems

From: Rafael J. Wysocki
Date: Tue Jan 11 2022 - 09:02:43 EST


On Mon, Jan 10, 2022 at 10:25 PM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 1/10/22 18:11, Bjorn Helgaas wrote:
> > On Mon, Jan 10, 2022 at 12:41:37PM +0100, Hans de Goede wrote:
> >> Hi All,
> >>
> >> On 12/17/21 15:13, Hans de Goede wrote:
> >>> Some BIOS-es contain a bug where they add addresses which map to system
> >>> RAM in the PCI host bridge window returned by the ACPI _CRS method, see
> >>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
> >>> space").
> >>>
> >>> To work around this bug, Linux has excluded E820 reserved addresses when
> >>> allocating addresses from the PCI host bridge window since 2010.
> >>>
> >>> Recently (2019) some systems have shown up with E820 reservations which
> >>> cover the entire _CRS returned PCI bridge memory window, causing all
> >>> attempts to assign memory to PCI BARs which have not been set up by the
> >>> BIOS to fail. For example here are the relevant dmesg bits from a
> >>> Lenovo IdeaPad 3 15IIL 81WE:
> >>>
> >>> [mem 0x000000004bc50000-0x00000000cfffffff] reserved
> >>> pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
> >>>
> >>> The ACPI specifications appear to allow this new behavior:
> >>>
> >>> The relationship between E820 and ACPI _CRS is not really very clear.
> >>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
> >>>
> >>> This range of addresses is in use or reserved by the system and is
> >>> not to be included in the allocatable memory pool of the operating
> >>> system's memory manager.
> >>>
> >>> and it may be used when:
> >>>
> >>> The address range is in use by a memory-mapped system device.
> >>>
> >>> Furthermore, sec 15.2 says:
> >>>
> >>> Address ranges defined for baseboard memory-mapped I/O devices, such
> >>> as APICs, are returned as reserved.
> >>>
> >>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
> >>> and its apertures are in use and certainly should not be included in
> >>> the general allocatable pool, so the fact that some BIOS-es report
> >>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
> >>>
> >>> So it seems that excluding E820 reserved addresses is a mistake.
> >>>
> >>> Ideally Linux would fully stop excluding E820 reserved addresses,
> >>> but then the old systems this was added for will regress.
> >>> Instead keep the old behavior for old systems, while ignoring
> >>> the E820 reservations for any systems from now on.
> >>>
> >>> Old systems are defined here as BIOS year < 2018; this was chosen to make
> >>> sure that E820 reservations will not be used on the currently affected
> >>> systems, while at the same time also taking into account that the systems
> >>> for which the E820 checking was originally added may have received BIOS
> >>> updates for quite a while (esp. CVE related ones), giving them a more
> >>> recent BIOS year than 2010.
> >>>
> >>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
> >>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
> >>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
> >>> BugLink: https://bugs.launchpad.net/bugs/1878279
> >>> BugLink: https://bugs.launchpad.net/bugs/1931715
> >>> BugLink: https://bugs.launchpad.net/bugs/1932069
> >>> BugLink: https://bugs.launchpad.net/bugs/1921649
> >>> Cc: Benoit Grégoire <benoitg@xxxxxxxx>
> >>> Cc: Hui Wang <hui.wang@xxxxxxxxxxxxx>
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Reviewed-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
> >>> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> >>> Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> >>> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx>
> >>> ---
> >>> Changes in v6:
> >>> - Remove the possibility to change the behavior from the commandline
> >>> because of worries that users may use this to paper over other problems
> >>
> >> ping ?
> >
> > Thanks, Hans. Maybe I'm quixotic, but I'm still hoping for an
> > approach based on firmware behavior instead of firmware date. If
> > nobody else tries, I will eventually try myself, but I don't have any
> > ETA.
>
> I really do NOT see how doing a better approach later blocks
> merging the date-based fix now?
>
> The date-based approach can simply be replaced by any better
> solution later.

Agreed.

> Can we please merge the date-based approach now so people's broken
> systems get fixed now, rather than at some unknown later time?

OK, I'll queue it up. Thanks!
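
For readers following the thread, the date-based rule from the quoted
commit message can be sketched roughly as below. This is only an
illustration under stated assumptions, not the code from the patch: the
names struct range, E820_CUTOFF_YEAR, honor_e820_reservations(),
clip_range() and usable_window() are all hypothetical, and the clipping
policy (keep the larger leftover piece) is an assumption made for the
example. The addresses in the usage note are the ones from the Lenovo
IdeaPad dmesg lines quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cutoff from the commit message: BIOS year < 2018 counts as "old". */
#define E820_CUTOFF_YEAR 2018   /* hypothetical name */

struct range {                  /* hypothetical type for this sketch */
    uint64_t start;
    uint64_t end;               /* inclusive */
};

/* Old systems keep the 2010 workaround; newer ones ignore E820. */
static bool honor_e820_reservations(int bios_year)
{
    return bios_year < E820_CUTOFF_YEAR;
}

/*
 * Trim 'avail' so that it no longer overlaps 'resv'.
 * Returns false if the reservation covers the whole window.
 * If space is left on both sides, the larger piece is kept
 * (an assumption made for this sketch).
 */
static bool clip_range(struct range *avail, const struct range *resv)
{
    uint64_t low_len = 0, high_len = 0;

    assert(avail->start <= avail->end);

    /* No overlap: nothing to trim. */
    if (resv->end < avail->start || resv->start > avail->end)
        return true;

    if (resv->start > avail->start)
        low_len = resv->start - avail->start;  /* space below resv */
    if (resv->end < avail->end)
        high_len = avail->end - resv->end;     /* space above resv */

    if (low_len == 0 && high_len == 0)
        return false;                          /* fully covered */

    if (low_len >= high_len)
        avail->end = resv->start - 1;
    else
        avail->start = resv->end + 1;
    return true;
}

/* True if some of the host bridge window is still usable for BARs. */
static bool usable_window(struct range *window, const struct range *resv,
                          int bios_year)
{
    if (!honor_e820_reservations(bios_year))
        return true;            /* newer BIOS: leave the window alone */
    return clip_range(window, resv);
}
```

With the reservation 0x4bc50000-0xcfffffff and the root bus window
0x65400000-0xbfffffff, usable_window() returns false for a BIOS year
before 2018, because the reservation covers the whole window. For a BIOS
year of 2018 or later the window is left untouched.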