Re: commit 7ffb791423c7 breaks steam game
From: Balbir Singh
Date: Wed Mar 26 2025 - 20:57:57 EST
On 3/27/25 09:58, Linus Torvalds wrote:
> On Wed, 26 Mar 2025 at 15:00, Bert Karwatzki <spasswolf@xxxxxx> wrote:
>>
>> As Balbir Singh found out this memory comes from amdkfd
>> (kgd2kfd_init_zone_device()) with CONFIG_HSA_AMD_SVM=y. The memory gets placed
>> by devm_request_free_mem_region() which places the memory at the end of the
>> physical address space (DIRECT_MAP_PHYSMEM_END). DIRECT_MAP_PHYSMEM_END changes
>> when using nokaslr and so the memory shifts.
>
> So I just want to say that having followed the thread as a spectator,
> big kudos to everybody involved in this thing. Particularly to you,
> Bert, for all your debugging and testing, and to Balbir for following
> up and figuring it out.
>
> Because this was a strange one.
>
Thanks!
>> One can work around this by removing the GFR_DESCENDING flag from
>> devm_request_free_mem_region() so the memory gets placed right after the other
>> resources:
>
> I worry that there might be other machines where that completely breaks things.
>
> There are various historical reasons why we look for addresses in high
> regions, ie on machines where there are various hidden IO regions that
> aren't enumerated by e820 and aren't found by our usual PCI BAR
> discovery because they are special hidden ones.
>
> So then users of [devm_]request_free_mem_region() might end up getting
> allocated a region that has some magic system resource in it.
>
> And no, this shouldn't happen on any normal machine, but it has
> definitely been a thing in the past.
>
> So I'm very happy that you guys figured out what ended up happening,
> but I'm not convinced that the devm_request_free_mem_region()
> workaround is tenable.
>
> So I think it needs to be more targeted to the HSA_AMD_SVM case than
> touch the devm_request_free_mem_region() logic for everybody.
>
I agree with your assessment. I was looking at whether bumping up
max_pfn for DEVICE_PRIVATE memory mappings via add_pages() is the
right thing to do, but I have not yet completed my code search.
From my understanding, max_pfn should be used as the end of system
RAM and direct_map_physmem_end as end of addressable memory. I proposed
not updating max_pfn for zone device based add_pages() on x86 via a test
patch that worked for Bert. This allows HSA_AMD_SVM, nokaslr, PCI_P2PDMA
to all co-exist, but I need to audit all of the max_pfn usage and assumptions.
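
To make the max_pfn vs direct_map_physmem_end distinction concrete, here is
a rough userspace model of the idea. It is not the test patch and not kernel
code: struct mem_model, model_add_pages() and the zone_device flag are
hypothetical names made up for illustration, and the real add_pages() has a
different signature. The only point is that system RAM may raise max_pfn,
while zone device (e.g. DEVICE_PRIVATE) ranges leave it alone and only need
to fit below the end of addressable memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: field names echo the kernel's, but this is a sketch. */
struct mem_model {
	unsigned long max_pfn;            /* end of system RAM, in page frames */
	unsigned long direct_map_end_pfn; /* end of addressable memory, in page frames */
};

/*
 * Model of adding a PFN range. Returns false if the range does not fit
 * below the end of addressable memory. Only non-zone-device ranges
 * (i.e. system RAM) are allowed to move max_pfn upwards.
 */
static bool model_add_pages(struct mem_model *m, unsigned long start_pfn,
			    unsigned long nr_pages, bool zone_device)
{
	unsigned long end_pfn = start_pfn + nr_pages;

	assert(m != NULL);

	if (end_pfn > m->direct_map_end_pfn)
		return false;

	/* Assumption under discussion: zone device memory is not system RAM. */
	if (!zone_device && end_pfn > m->max_pfn)
		m->max_pfn = end_pfn;

	return true;
}
```

In this model a device range placed near the top of the address space is
accepted but does not change max_pfn, so anything that treats max_pfn as
the end of RAM is unaffected by where the device memory lands.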
Balbir Singh