Re: [RFC PATCH] Limit reclaim to avoid TTM desktop stutter under mem pressure
From: Christian König
Date: Tue Apr 07 2026 - 03:43:55 EST
On 4/6/26 23:02, Matthew Brost wrote:
> On Tue, Mar 31, 2026 at 10:08:58PM -0400, Daniel Colascione wrote:
...
>> -
>> - /*
>> - * Do not add latency to the allocation path for allocations orders
>> - * device tolds us do not bring them additional performance gains.
>> - */
>> - if (beneficial_order && order > beneficial_order)
>> - gfp_flags &= ~__GFP_DIRECT_RECLAIM;
>> + if (beneficial_order && order > beneficial_order)
>> + gfp_flags &= ~__GFP_DIRECT_RECLAIM;
>> + if (order > max_reclaim_order)
>> + gfp_flags &= ~__GFP_RECLAIM;
>
> I’m not very familiar with this code, but at first glance it doesn’t
> seem quite right.
>
> Would setting Xe’s beneficial to 9, similar to AMD’s, along with this
> diff, help?
No, not really. The problem is that setting 9 as the beneficial order only saves us direct reclaim for order 10 allocations (order >= 11 is usually not used in an x86 Linux kernel anyway).
>
> If I’m understanding this correctly, we would try a single allocation
> attempt with __GFP_DIRECT_RECLAIM cleared for the size we care about,
> still attempt allocations from the pools, and then finally fall back to
> allocating single pages one at a time.
Well the code is a bit broken, but the general idea is not so bad.
What we could do is use beneficial_order as the sweet spot and set __GFP_DIRECT_RECLAIM only for allocations of exactly that order.
That would skip setting it for orders 1..8, which are nice to have as well, but not so necessary that we always need to trigger reclaim for them.
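Roughly that policy, as a standalone sketch (mocked-up flag values and a hypothetical pick_gfp_flags() helper, not the actual ttm_pool code; order 0 keeps direct reclaim since single pages are the last-resort fallback):

```c
/* Illustration only: placeholder bit values, not the real gfp_types.h ones. */
#define MOCK_GFP_DIRECT_RECLAIM	0x400u
#define MOCK_GFP_USER		(0x100u | MOCK_GFP_DIRECT_RECLAIM)

/*
 * Allow direct reclaim only for the order the device told us actually
 * pays off, and for the order-0 fallback. Intermediate orders 1..8 are
 * nice to have but not worth stalling the allocation path for.
 */
static unsigned int pick_gfp_flags(unsigned int order,
				   unsigned int beneficial_order)
{
	unsigned int gfp_flags = MOCK_GFP_USER;

	if (order && order != beneficial_order)
		gfp_flags &= ~MOCK_GFP_DIRECT_RECLAIM;

	return gfp_flags;
}
```

With beneficial_order = 9, only order 9 and order 0 allocations would ever trigger direct reclaim.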
Regards,
Christian.
>
> Matt
>
> diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> index aa41099c5ecf..f1f430aba0c1 100644
> --- a/drivers/gpu/drm/ttm/ttm_pool.c
> +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> @@ -714,6 +714,7 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
> struct ttm_pool_alloc_state *alloc,
> struct ttm_pool_tt_restore *restore)
> {
> + const unsigned int beneficial_order = ttm_pool_beneficial_order(pool);
> enum ttm_caching page_caching;
> gfp_t gfp_flags = GFP_USER;
> pgoff_t caching_divide;
> @@ -757,7 +758,8 @@ static int __ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt,
> if (!p) {
> page_caching = ttm_cached;
> allow_pools = false;
> - p = ttm_pool_alloc_page(pool, gfp_flags, order);
> + if (!order || order >= beneficial_order)
> + p = ttm_pool_alloc_page(pool, gfp_flags, order);
> }
> /* If that fails, lower the order if possible and retry. */
> if (!p) {
>
>
>> + }
>>
>> if (!ttm_pool_uses_dma_alloc(pool)) {
>> p = alloc_pages_node(pool->nid, gfp_flags, order);