Re: [PATCH 0/4] PCI: Continue E820 vs host bridge window saga
From: Hans de Goede
Date: Wed Dec 07 2022 - 10:32:25 EST
Hi Bjorn,
On 12/4/22 10:13, Hans de Goede wrote:
<snip>
>>> 2. I am afraid that now allowing PCI MMIO space to be allocated
>>> in regions marked as EfiMemoryMappedIO will cause regressions
>>> on some systems. Specifically when I tried something similar
>>> the last time I looked at this (using the BIOS date cut-off
>>> approach IIRC) there was a suspend/resume regression on
>>> a Lenovo ThinkPad X1 carbon (20A7) model:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=2029207
>>>
>>> Back then I came to the conclusion that the problem is that not
>>> avoiding the EfiMemoryMappedIO regions caused PCI MMIO space to
>>> be allocated in the 0xdfa00000 - 0xdfa10000 range which is
>>> listed in the EFI memmap as:
>>>
>>> [ 0.000000] efi: mem46: [MMIO |RUN| | | | | | | | | | | | | ] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB)
>>>
>>> And with current kernels with the extra logging added for this
>>> the following is logged related to this:
>>>
>>> [ 0.326504] acpi PNP0A08:00: clipped [mem 0xdfa00000-0xfebfffff window] to [mem 0xdfa10000-0xfebfffff window] for e820 entry [mem 0xdceff000-0xdfa0ffff]
>>>
>>> I believe patch 1/4 of this set will make this clipping go away,
>>> re-introducing the suspend/resume problem.
>>
>> Yes, I'm afraid you're right. Comparing the logs at comment #31
>> (fails) and comment #38 (works):
>>
>> pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window]
>> pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfa00000-0xdfbfffff] fails
>> pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfb00000-0xdfcfffff] works
>>
>> Since 0xdfa00000 is included in the host bridge _CRS, but isn't
>> usable, my guess is this is a _CRS bug.
>
> Ack.
>
> So I was thinking to maybe limit the removal of EfiMemoryMappedIO
> regions from the E820 map if they are big enough to cause troubles?
>
> Looking at the EFI map MMIO regions on this Lenovo ThinkPad X1 carbon
> (20A7) model, they are tiny. Where as the ones which we know cause
> problems are huge. So maybe add a bit of heuristics to patch 1/4 based
> on the EfiMemoryMappedIO region size and only remove the big ones
> from the E820 map ?
>
> I know that adding heuristics like this always feels a bit wrong,
> because you end up putting a somewhat arbitrary cut off point in
> the code on which to toggle behavior on/off, but I think that in
> this case it should work nicely given how huge the EfiMemoryMappedIO
> regions which are actually causing problems are.
One of the original reporters of the suspend/resume problem has gotten
back to me in:
https://bugzilla.redhat.com/show_bug.cgi?id=2029207
And they are willing to run some more tests. I could give them
a 6.0 kernel with the 4 patches from this series added to test,
but I think we both agree that it is very likely that the suspend/resume
problem will resurface, so I'm not sure how useful that would be ?
>> Or maybe BIOS is using the producer/consumer bit in _CRS to identify
>> this as register space as opposed to a window? I thought we couldn't
>> rely on that bit, but it's been a long time since I looked at it. An
>> acpidump might have a clue.
I have asked the reporter of;
https://bugzilla.redhat.com/show_bug.cgi?id=2029207
To grab any acpidump.
Regards,
Hans
p.s.
Looking at the efi=debug output from:
https://bugzilla-attachments.redhat.com/attachment.cgi?id=1861035
The small MMIO regions which we most honor as reserved do
have the "RUN" (runtime) flag set in the EFI mmap.
But I'm afraid that the same applies to the troublesome
MMIO EFI regions which cause the failures to assign
PCI regions for devices not setup by the firmware:
https://bugzilla-attachments.redhat.com/attachment.cgi?id=1861407
So that "RUN" flag is of no use.