Re: [PATCH] cgroup/dmem: implement dmem.high soft limit and throttling
From: Maarten Lankhorst
Date: Thu May 21 2026 - 06:27:37 EST
Hello Qiliang,
On 2026-05-20 at 08:07, Qiliang Yuan wrote:
> Introduce the "high" soft limit for the dmem cgroup v2 controller.
> When a cgroup's device memory usage exceeds its high limit, tasks
> belonging to that cgroup are throttled by being forced into a sleep
> before returning to user space, instead of being failed outright
> as with the "max" limit.
>
> Key changes:
> - Add high counter configuration to dmem_cgroup_pool.
> - Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
> - Inject the dmem throttling handler into resume_user_mode_work.
> - Implement the handler to perform a 100ms interruptible sleep for
> over-limit tasks.
>
> This mechanism provides smoother over-subscription support for device
> memory resources.
>
> Signed-off-by: Qiliang Yuan <realwujing@xxxxxxxxx>
> ---
> This series introduces the "high" soft limit and associated task
> throttling mechanism to the dmem cgroup v2 controller.
>
> The device memory (VRAM) management currently only supports hard limits
> (max), which leads to immediate allocation failures when reached. This
> can be disruptive for GPU-bound AI workloads. By introducing a soft
> limit, we allow cgroups to exceed their quota temporarily while
> applying backpressure via task throttling before the process returns
> to user space.
>
> The mechanism is inspired by the memory cgroup's high limit:
> - When usage > high, the task is marked with TIF_NOTIFY_RESUME.
> - Upon returning to user space, it triggers a 100ms sleep.
> - This provides a smoother over-subscription model for GPU resources.
>
> Qiliang Yuan (1):
>
> cgroup/dmem: implement dmem.high soft limit and throttling
> ---
> To: Maarten Lankhorst <dev@xxxxxxxxxxxx>
> To: Maxime Ripard <mripard@xxxxxxxxxx>
> To: Natalie Vock <natalie.vock@xxxxxx>
> To: Tejun Heo <tj@xxxxxxxxxx>
> To: Johannes Weiner <hannes@xxxxxxxxxxx>
> To: Michal Koutný <mkoutny@xxxxxxxx>
> Cc: cgroups@xxxxxxxxxxxxxxx
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---
I think the concept of throttling userspace once it goes over 'high'
is interesting.
It's the approach I'm more worried about. I believe it would be
better to penalize clients that exceed their high limit by
preferentially evicting from those clients.
It would make eviction run in 3 passes on the affected cgroup tree:
- Round 1: Clients above their 'high' limit
- Round 2: Clients above their 'low/min' limits
- Round 3: Clients at or below their 'low' limit
The allocating client's own cgroup would remain eligible for eviction
as well, even below its 'min' limit.
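
As a rough illustration of the ordering above, here is a small userspace
sketch in C. It is not kernel code and not taken from any existing
patch: struct client_limits, evict_round_for() and pick_victim() are
hypothetical names, it assumes min <= low <= high for every client, and
it reads "above their 'low/min' limits" as "above 'low'". The special
case for the allocating client's own cgroup is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-client snapshot of dmem usage and limits (bytes). */
struct client_limits {
	uint64_t usage;
	uint64_t min;
	uint64_t low;
	uint64_t high;
};

/* The three proposed eviction rounds, in the order they would run. */
enum evict_round {
	EVICT_ROUND_OVER_HIGH = 1,	/* usage above 'high' */
	EVICT_ROUND_OVER_LOW = 2,	/* usage above 'low', not above 'high' */
	EVICT_ROUND_PROTECTED = 3,	/* usage at or below 'low' */
};

/* Classify one client; assumes min <= low <= high. */
enum evict_round evict_round_for(const struct client_limits *c)
{
	assert(c != NULL);

	if (c->usage > c->high)
		return EVICT_ROUND_OVER_HIGH;
	if (c->usage > c->low)
		return EVICT_ROUND_OVER_LOW;
	return EVICT_ROUND_PROTECTED;
}

/*
 * Walk the rounds in order and return the index of the first client
 * found in the earliest non-empty round, or -1 if there are no clients.
 */
int pick_victim(const struct client_limits *clients, size_t n)
{
	int round;
	size_t i;

	for (round = EVICT_ROUND_OVER_HIGH; round <= EVICT_ROUND_PROTECTED; round++) {
		for (i = 0; i < n; i++) {
			if ((int)evict_round_for(&clients[i]) == round)
				return (int)i;
		}
	}
	return -1;
}
```

With this ordering a client over 'high' is always picked before one
that is merely over 'low', so the penalty for exceeding 'high' is losing
residency first rather than being put to sleep.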
I'm open to other ideas as well. Perhaps a flag that would make
an allocation, or a bind to an address space, fail if it would need
to evict, or a notification sent to the affected client when it
goes over 'high'.
Have you tried any other approaches before this one?
Kind regards,
~Maarten Lankhorst