Re: [PATCH 2/3] drm/amdgpu: block THP for HSA userptr notifiers
From: Christian König
Date: Thu Jun 25 2026 - 08:37:17 EST
On 6/25/26 12:59, Yitao Jiang wrote:
> HSA userptr buffer objects are used by KFD compute queues. On systems
> where the GPU cannot reliably tolerate a CPU THP remap of an active
> userptr range, allowing khugepaged or MADV_COLLAPSE to replace PTE
> mappings with a PMD mapping can leave later GPU work failing
> asynchronously.
Absolutely clear NAK to this.
That largely sounds like it just works around some issue and is not really a viable fix.
Regards,
Christian.
>
> Register HSA userptr interval notifiers with
> MMU_INTERVAL_NOTIFIER_BLOCK_THP. GFX userptrs keep the existing
> notifier path and do not opt in.
>
> Assisted-by: OpenAI-Codex:GPT-5.5
> Signed-off-by: Yitao Jiang <jytscientist@xxxxxxxxxxx>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> index 99bc9ad67..c0b36164c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
> @@ -44,6 +44,7 @@
> */
>
> #include <linux/firmware.h>
> +#include <linux/mm.h>
> #include <linux/module.h>
> #include <drm/drm.h>
>
> @@ -130,16 +131,24 @@ static const struct mmu_interval_notifier_ops amdgpu_hmm_hsa_ops = {
> */
> int amdgpu_hmm_register(struct amdgpu_bo *bo, unsigned long addr)
> {
> + struct mm_struct *mm = current->mm;
> + unsigned long size = amdgpu_bo_size(bo);
> int r;
>
> - if (bo->kfd_bo)
> - r = mmu_interval_notifier_insert(&bo->notifier, current->mm,
> - addr, amdgpu_bo_size(bo),
> - &amdgpu_hmm_hsa_ops);
> - else
> - r = mmu_interval_notifier_insert(&bo->notifier, current->mm, addr,
> - amdgpu_bo_size(bo),
> - &amdgpu_hmm_gfx_ops);
> + if (unlikely(!mm))
> + return -ESRCH;
> +
> + if (bo->kfd_bo) {
> + mmap_write_lock(mm);
> + r = mmu_interval_notifier_insert_locked_flags(&bo->notifier, mm,
> + addr, size,
> + &amdgpu_hmm_hsa_ops,
> + MMU_INTERVAL_NOTIFIER_BLOCK_THP);
> + mmap_write_unlock(mm);
> + } else {
> + r = mmu_interval_notifier_insert(&bo->notifier, mm, addr, size,
> + &amdgpu_hmm_gfx_ops);
> + }
> if (r)
> /*
> * Make sure amdgpu_hmm_unregister() doesn't call
> --
> 2.53.0
>