Re: [PATCH 0/4] PCI: Continue E820 vs host bridge window saga

From: Hans de Goede
Date: Sun Dec 04 2022 - 04:30:54 EST


Hi Bjorn,

On 12/3/22 18:57, Bjorn Helgaas wrote:
> On Sat, Dec 03, 2022 at 01:44:10PM +0100, Hans de Goede wrote:
>> Hi Bjorn,
>>
>> On 12/2/22 22:18, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>>>
>>> When allocating space for PCI BARs, Linux avoids allocating space mentioned
>>> in the E820 map. This was originally done by 4dc2287c1805 ("x86: avoid
>>> E820 regions when allocating address space") to work around BIOS defects
>>> that included unusable space in host bridge _CRS.
>>>
>>> Some recent machines use EfiMemoryMappedIO for PCI MMCONFIG and host bridge
>>> apertures, and bootloaders and EFI stubs convert those to E820 regions,
>>> which means we can't allocate space for hot-added PCI devices (often a
>>> dock) or for devices the BIOS didn't configure (often a touchpad).
>>>
>>> The current strategy is to add DMI quirks that disable the E820 filtering
>>> on these machines and to disable it entirely starting with 2023 BIOSes:
>>>
>>> d341838d776a ("x86/PCI: Disable E820 reserved region clipping via quirks")
>>> 0ae084d5a674 ("x86/PCI: Disable E820 reserved region clipping starting in 2023")
>>>
>>> But the quirks are problematic because it's really hard to list all the
>>> machines that need them.
>>>
>>> This series is an attempt at a more generic approach. I'm told by firmware
>>> folks that EfiMemoryMappedIO means "the OS should map this area so EFI
>>> runtime services can use it in virtual mode," but does not prevent the OS
>>> from using it.
>>>
>>> The first patch removes any EfiMemoryMappedIO areas from the E820 map.
>>> This doesn't affect any virtual mapping of those areas (that would have to
>>> be done directly from the EFI memory map) but it means Linux can allocate
>>> space for PCI MMIO.
>>>
>>> The rest are basically cosmetic log message changes.
>>
>> Thank you for working on this. I'm a bit worried about this series though.
>>
>> The 2 things which I worry about are:
>>
>>
>> 1. I think this will not help when people boot in BIOS (CSM) mode rather
>> than UEFI mode, which quite a few Linux users still do because they learned
>> to do this years ago when Linux EFI support (and EFI fw itself) was still
>> a bit in flux.
>>
>> IIRC from the last time we looked at this in CSM mode the BIOS itself
>> translates the EfiMemoryMappedIO areas to reserved E820 regions. So when
>> people use the BIOS CSM mode to boot, then this patch will not help
>> since the kernel lacks the info to do the translation.
>
> Right, if BIOS CSM puts EfiMemoryMappedIO in the E820 map the same way
> bootloaders do, and the kernel doesn't have the EFI memory map, this
> series won't help.

So I just got the requested dmesg in BIOS CSM mode from:
https://bugzilla.redhat.com/show_bug.cgi?id=1868899

And it says:

[ 0.000000] BIOS-e820: [mem 0x000000004bc50000-0x00000000cfffffff] reserved
[ 0.316140] pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]

So I'm afraid that I remembered correctly and the CSM adds
the EfiMemoryMappedIO regions to the E820 map as reserved :(
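
To make the overlap concrete, here is a minimal sketch (illustrative
Python, not kernel code) using only the two ranges from the dmesg lines
above. The clip() helper and the constant names are hypothetical and are
not the kernel's actual clipping logic; the sketch just subtracts one
inclusive range from another:

```python
# Illustrative only, not kernel code. Ranges are inclusive (start, end)
# and are copied from the two dmesg lines quoted above.
E820_RESERVED = (0x4bc50000, 0xcfffffff)
HOST_BRIDGE_WINDOW = (0x65400000, 0xbfffffff)


def clip(window, reserved):
    """Hypothetical helper: return the parts of `window` that remain
    after removing everything covered by `reserved`."""
    w_start, w_end = window
    r_start, r_end = reserved
    if r_end < w_start or r_start > w_end:
        return [window]  # no overlap, the window is untouched
    remaining = []
    if w_start < r_start:
        remaining.append((w_start, r_start - 1))  # part below the reserved range
    if r_end < w_end:
        remaining.append((r_end + 1, w_end))  # part above the reserved range
    return remaining


usable = clip(HOST_BRIDGE_WINDOW, E820_RESERVED)
print(usable)  # prints []
```

The reserved region starts below the window and ends above it, so
nothing of this window is left once the reserved range is taken out.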

So as you said, this series won't help people booting in BIOS
compatibility mode, which means that we should at least keep
the current list of no_e820 quirks to avoid regressing those models
when booted in BIOS compatibility mode.

And maybe we should still add at least the Clevo model for which I
recently submitted a new no_e820 quirk, so that it will work in BIOS
CSM mode too?

Note: I know you did not propose dropping the quirks in this series;
I'm just covering all the bases here.

Regards,

Hans