Re: [GIT PULL] arm64 fix for 5.14

From: David Hildenbrand
Date: Tue Aug 31 2021 - 15:16:35 EST


On 31.08.21 15:31, Will Deacon wrote:
[+David]

On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote:
On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@xxxxxx> wrote:

They CCed me on their earlier discussion, but I did not catch up on it
until you responded to the pull request. If I understood it correctly, it
was about a platform device mapping an MMIO region (like a PCI BAR),
but something about section alignment caused pfn_valid() to mistrigger.

Yeah, so I can easily see how the max_pfn numbers can end up being
rounded up to a whole memory section etc.
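
To make the rounding effect concrete, here is a minimal user-space sketch
(not kernel code). The page and section sizes and all names ending in
_SKETCH are assumptions picked for illustration, not taken from any
particular kernel configuration. It shows how a pfn just past the real end
of RAM can still fall below a section-rounded limit:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only: 4 KiB pages, 128 MiB sections. */
#define PAGE_SHIFT_SKETCH        12
#define SECTION_SHIFT_SKETCH     27
#define PAGES_PER_SECTION_SKETCH \
	(1ULL << (SECTION_SHIFT_SKETCH - PAGE_SHIFT_SKETCH))

/* Round a pfn up to the next section boundary (hypothetical helper). */
static uint64_t section_align_up(uint64_t pfn)
{
	uint64_t mask = PAGES_PER_SECTION_SKETCH - 1;

	/* The mask trick below only works for power-of-two section sizes. */
	assert((PAGES_PER_SECTION_SKETCH & mask) == 0);
	return (pfn + mask) & ~mask;
}

/*
 * Return 1 if pfn lies past the exact end of RAM but still below the
 * section-rounded limit, i.e. in the "slack" where a coarse range check
 * would accept a pfn that is not backed by RAM.
 */
static int in_rounding_slack(uint64_t pfn, uint64_t exact_max_pfn)
{
	return pfn >= exact_max_pfn && pfn < section_align_up(exact_max_pfn);
}
```

With these assumed sizes a section covers 0x8000 pages, so an exact limit
of pfn 0x8800 rounds up to 0x10000 and every pfn in between sits in the
slack.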

I think my suggested solution should JustWork(tm) - exactly because if
the area is then in that "this pfn is valid" area, it will
double-check the actual underlying page.

I think the pitfall there is that the 'struct page' might well exist,
but isn't necessarily initialised with anything meaningful. I remember
seeing something like that in the past (I think for "no-map" memory) and
David's reply here:

https://lore.kernel.org/r/aff3942e-b9ce-5bae-8214-0e5d89cd071c@xxxxxxxxxx

hints that there are still gotchas with looking at the page flags for
pages if the memory is offline or ZONE_DEVICE.

Don't get me wrong, I'd really like to use the generic code here as I
think it would help to set expectations around what pfn_valid() actually
means, I'm just less keen on the try-it-and-see-what-breaks approach
given how sensitive it is to the layout of the physical memory map.

That said, I think x86 avoids the problem another way - by just making
sure max_pfn is exact. That works too, as long as there are no holes
in the RAM map that might be used for PCI BARs.

So I think arm could fix it that way too, depending on their memory layout.

The physical memory map is the wild west, unfortunately. It's one of the
things where everybody does something different and it's very common to see
disjoint banks of memory placed seemingly randomly around.

IIRC, the resource tree is usually the best place to really identify
what's system RAM and what's not. memblock should work on applicable
archs as well. Identifying ZONE_DEVICE ranges reliably is a different
story ...
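
As a rough illustration of the resource-tree idea, here is a user-space
sketch (not kernel code) that checks one line in the format /proc/iomem
uses to expose the resource tree. The helper name is hypothetical, and it
assumes lines of the form "start-end : name" with hex addresses:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if the given /proc/iomem-style line
 * describes a "System RAM" resource that contains addr, else 0.
 * Assumes the "start-end : name" layout with hexadecimal addresses.
 */
static int line_is_system_ram_containing(const char *line, uint64_t addr)
{
	unsigned long long start, end;
	char name[64];

	/* The leading space in the format skips indentation of child entries. */
	if (sscanf(line, " %llx-%llx : %63[^\n]", &start, &end, name) != 3)
		return 0;
	if (strcmp(name, "System RAM") != 0)
		return 0;
	return addr >= start && addr <= end;
}
```

A caller would feed it each line read from /proc/iomem; note that
unprivileged readers typically see zeroed addresses there, so a real
check would need sufficient privileges.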

--
Thanks,

David / dhildenb