Re: commit 7ffb791423c7 breaks steam game
From: Ingo Molnar
Date: Thu Mar 27 2025 - 06:53:30 EST
* Balbir Singh <balbirs@xxxxxxxxxx> wrote:
> > Yes, turning off CONFIG_HSA_AMD_SVM fixes the issue, the strange memory
> > resource
> > afe00000000-affffffffff : 0000:03:00.0
> > is gone.
> >
> > If one would add a max_phys_addr argument to devm_request_free_mem_region()
> > (which returns the resource address in kgd2kfd_init_zone_device()) one could keep
> > the memory below the 44bit limit with CONFIG_HSA_AMD_SVM enabled.
> >
>
> Thanks for reporting the result, does this patch work?
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 01ea7c6df303..14f42f8012ab 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -968,8 +968,9 @@ int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
>  	WARN_ON_ONCE(ret);
>  
>  	/* update max_pfn, max_low_pfn and high_memory */
> -	update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
> -				  nr_pages << PAGE_SHIFT);
> +	if (!params->pgmap)
> +		update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
> +					  nr_pages << PAGE_SHIFT);
>  
>  	return ret;
>  }
>
> It basically prevents max_pfn from moving when the inserted memory is
> zone_device.
>
> FYI: It's a test patch and will still create issues if the amount of
> present memory (physically) is very high, because the driver needs to
> enable use_dma32 in that case.

So this patch does the trick for Bert, and I'm wondering what the best
fix here would be overall, because it's a tricky situation.

Am I correct in assuming that with enough physical memory this bug
would trigger, with and without nokaslr?

I *think* the best approach going forward would be to add the above
quirk to the x86 memory setup code, but also issue a kernel warning at
that point with all the relevant information included, so that the
driver's use_dma32 bug can at least be indicated?

That might also trigger for other systems, because if this scenario is
so spurious, I doubt it's the only affected driver ...
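
To make the idea concrete, here is a rough userspace model (not kernel
code) of the quirk plus the warning. Only the !params->pgmap check comes
from the patch above; add_pages_model(), skipped_end_pfn, the warning
text and the simplified struct definitions are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

/* Simplified, hypothetical stand-ins for the kernel types. */
struct dev_pagemap { int unused; };
struct mhp_params { struct dev_pagemap *pgmap; };

static uint64_t max_pfn;          /* models the kernel's max_pfn */
static uint64_t skipped_end_pfn;  /* hypothetical: end PFN of the last skipped range */

/* Models update_end_of_memory_vars(): raise max_pfn if the range ends above it. */
static void update_end_of_memory_vars(uint64_t start, uint64_t size)
{
	uint64_t end_pfn = (start + size) >> PAGE_SHIFT;

	if (end_pfn > max_pfn)
		max_pfn = end_pfn;
}

/* Models the tail of add_pages() with the quirk and a warning added. */
static int add_pages_model(uint64_t start_pfn, uint64_t nr_pages,
			   const struct mhp_params *params)
{
	assert(params != NULL);

	if (params->pgmap) {
		/* ZONE_DEVICE range: leave max_pfn alone, but say so. */
		skipped_end_pfn = start_pfn + nr_pages;
		fprintf(stderr,
			"zone_device range [%#llx-%#llx] not added to max_pfn (%#llx)\n",
			(unsigned long long)(start_pfn << PAGE_SHIFT),
			(unsigned long long)((skipped_end_pfn << PAGE_SHIFT) - 1),
			(unsigned long long)max_pfn);
		return 0;
	}

	update_end_of_memory_vars(start_pfn << PAGE_SHIFT,
				  nr_pages << PAGE_SHIFT);
	return 0;
}
```

In this model a regular hotplug range still moves max_pfn, while a range
like afe00000000-affffffffff with a pgmap set only produces the message.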

Thanks,

Ingo