Re: [PATCH 3/5] drm/amdkfd: use vma_is_stack() and vma_is_heap()

From: Felix Kuehling
Date: Fri Jul 14 2023 - 11:09:49 EST


On 2023-07-14 at 10:26, Vlastimil Babka wrote:
On 7/12/23 18:24, Felix Kuehling wrote:
Allocations in the heap and stack tend to be small, with several
allocations sharing the same page. Sharing the same page for different
allocations with different access patterns leads to thrashing when we
migrate data back and forth on GPU and CPU access. To avoid this we
disable HMM migrations for heap and stack VMAs.
I wonder how well it really works in practice? AFAIK "heaps" (malloc())
today use various arenas obtained by mmap() and not a single brk()-managed
space anymore? And programs might be multithreaded, thus have multiple
stacks, while vma_is_stack() will recognize only the initial one...
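
To make the concern above concrete, here is a minimal user-space sketch of a range check of the kind being discussed. It is not the kernel or amdkfd code: struct mm_model, struct vma_model and the model_* functions are hypothetical stand-ins, and the comparisons are an assumption about how a "heap or initial stack" test based on brk, start_brk and start_stack might look.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few mm_struct fields the check needs. */
struct mm_model {
    unsigned long start_brk;   /* start of the brk-managed heap */
    unsigned long brk;         /* current end of the brk-managed heap */
    unsigned long start_stack; /* an address inside the initial thread's stack */
};

/* Hypothetical stand-in for a VMA: the address range [start, end). */
struct vma_model {
    unsigned long start;
    unsigned long end;
};

/* Assumed check: the VMA overlaps the brk-managed heap range. */
static bool model_vma_is_heap(const struct mm_model *mm,
                              const struct vma_model *vma)
{
    assert(vma->start < vma->end);
    return vma->start <= mm->brk && vma->end >= mm->start_brk;
}

/* Assumed check: the VMA covers the initial stack address.
 * Stacks of other threads live in separate mappings and do not match. */
static bool model_vma_is_stack(const struct mm_model *mm,
                               const struct vma_model *vma)
{
    assert(vma->start < vma->end);
    return vma->start <= mm->start_stack && vma->end >= mm->start_stack;
}

/* Policy from the thread: no migration for heap and stack VMAs. */
static bool model_migration_allowed(const struct mm_model *mm,
                                    const struct vma_model *vma)
{
    return !model_vma_is_heap(mm, vma) && !model_vma_is_stack(mm, vma);
}
```

With a brk heap at 0x1000-0x5000 and start_stack at 0x7ffff000, this model refuses migration for the brk heap and the initial stack, but allows it for an mmap()ed malloc arena or a secondary thread stack placed elsewhere in the address space. A very large buffer obtained through sbrk() would also land in the brk range and be refused.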

Thanks for these pointers. I have not heard of such problems with mmap arenas and multiple thread stacks in practice. But I'll keep it in mind in case we observe unexpected thrashing in the future. FWIW, we once had the opposite problem of a custom malloc implementation that used sbrk for very large allocations. This disabled migrations of large buffers unexpectedly.

I agree that eventually we'll want a more dynamic way of detecting and suppressing thrashing that's based on observed memory access patterns. Getting this right is probably trickier than it sounds, so I'd prefer to have some more experience with real workloads to use as benchmarks. Compared to other things we're working on, this is fairly low on our priority list at the moment. Using the VMA flags is a simple and effective method for now, at least until we see it failing in real workloads.

Regards,
  Felix



Vlastimil

Regards,
  Felix


On 2023-07-12 at 10:42, Christoph Hellwig wrote:
On Wed, Jul 12, 2023 at 10:38:29PM +0800, Kefeng Wang wrote:
Use the helpers to simplify code.
Nothing against your addition of a helper, but a GPU driver really
should have no business even looking at this information.