Re: [PATCH] cgroup/dmem: implement dmem.high soft limit and throttling

From: Maarten Lankhorst

Date: Thu May 21 2026 - 06:27:37 EST


Hello Qiliang,

On 2026-05-20 at 08:07, Qiliang Yuan wrote:
> Introduce the "high" soft limit for the dmem cgroup v2 controller.
> When a cgroup's device memory usage exceeds its high limit, tasks
> belonging to that cgroup are throttled by being forced into a sleep
> before returning to user space, instead of being failed outright
> as with the "max" limit.
>
> Key changes:
> - Add high counter configuration to dmem_cgroup_pool.
> - Add over-high check in the try_charge path and set TIF_NOTIFY_RESUME.
> - Inject the dmem throttling handler into resume_user_mode_work.
> - Implement the handler to perform a 100ms interruptible sleep for
> over-limit tasks.
>
> This mechanism provides smoother over-subscription support for device
> memory resources.
>
> Signed-off-by: Qiliang Yuan <realwujing@xxxxxxxxx>
> ---
> This series introduces the "high" soft limit and associated task
> throttling mechanism to the dmem cgroup v2 controller.
>
> The device memory (VRAM) management currently only supports hard limits
> (max), which leads to immediate allocation failures when reached. This
> can be disruptive for GPU-bound AI workloads. By introducing a soft
> limit, we allow cgroups to exceed their quota temporarily while
> applying backpressure via task throttling before the process returns
> to user space.
>
> The mechanism is inspired by the memory cgroup's high limit:
> - When usage > high, the task is marked with TIF_NOTIFY_RESUME.
> - Upon returning to user space, it triggers a 100ms sleep.
> - This provides a smoother over-subscription model for GPU resources.
>
> Qiliang Yuan (1):
>
> cgroup/dmem: implement dmem.high soft limit and throttling
> ---
> To: Maarten Lankhorst <dev@xxxxxxxxxxxx>
> To: Maxime Ripard <mripard@xxxxxxxxxx>
> To: Natalie Vock <natalie.vock@xxxxxx>
> To: Tejun Heo <tj@xxxxxxxxxx>
> To: Johannes Weiner <hannes@xxxxxxxxxxx>
> To: Michal Koutný <mkoutny@xxxxxxxx>
> Cc: cgroups@xxxxxxxxxxxxxxx
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> ---

I think the concept of allowing userspace to throttle on high
is interesting.

It's the approach I'm more worried about. I believe it would be
better to punish clients that exceed their high limit by
preferentially evicting them.

It would make eviction run in 3 passes on the affected cgroup tree:
- Round 1: Clients above their 'high' limit
- Round 2: Clients above their 'low/min' limits
- Round 3: Clients at or below their 'low' limit

And, within the same client's own cgroup, below the 'min' limit as well.
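
A rough userspace sketch of the three rounds above, purely to illustrate
the ordering. Everything in it is an assumption made for the example:
struct client, eviction_round() and evict() are hypothetical names, the
fields are not the real dmem_cgroup_pool layout, 'low' stands in for the
'low/min' protection, and the special case for the client's own cgroup
below 'min' is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-client accounting, not the real dmem structures. */
struct client {
	const char *name;
	uint64_t usage;	/* bytes of device memory currently charged */
	uint64_t low;	/* protection threshold ('low/min' above) */
	uint64_t high;	/* proposed soft limit */
};

/* Round in which a client first becomes an eviction candidate. */
static int eviction_round(const struct client *c)
{
	if (c->usage > c->high)
		return 1;	/* above 'high': evict first */
	if (c->usage > c->low)
		return 2;	/* above its protection */
	return 3;		/* at or below 'low': last resort */
}

/*
 * Reclaim up to 'needed' bytes, visiting clients round by round.
 * Round 1 only takes the excess above 'high', round 2 the excess
 * above 'low', round 3 whatever is left. Returns the bytes freed.
 */
static uint64_t evict(struct client *clients, size_t n, uint64_t needed)
{
	uint64_t freed = 0;

	assert(n == 0 || clients != NULL);

	for (int round = 1; round <= 3 && freed < needed; round++) {
		for (size_t i = 0; i < n && freed < needed; i++) {
			struct client *c = &clients[i];
			uint64_t evictable, take;

			if (eviction_round(c) != round)
				continue;

			if (round == 1)
				evictable = c->usage - c->high;
			else if (round == 2)
				evictable = c->usage - c->low;
			else
				evictable = c->usage;

			take = evictable < needed - freed ?
			       evictable : needed - freed;
			c->usage -= take;
			freed += take;
		}
	}
	return freed;
}
```

Because usage is updated as memory is reclaimed, a client trimmed down
to its 'high' limit in round 1 is naturally reconsidered in round 2, so
protected clients are only touched once the offenders have been brought
back under their limits.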

I'm open to other ideas as well. Perhaps a flag that would make an
allocation, or a bind to an address space, fail if it would need to
evict, or a notification sent to the affected client when it goes
over high.

Have you tried any other approaches before this one?

Kind regards,
~Maarten Lankhorst