Re: [PATCH 1/2] cgroup/dmem: add per-region event counters
From: Maarten Lankhorst
Date: Tue Jun 30 2026 - 17:10:04 EST
Hello,
On 6/25/26 12:21, Hongfu Li wrote:
> Hi,
>
> On 6/25/26 4:57 PM, Natalie Vock wrote:
>> Hi,
>>
>> On 6/25/26 04:10, Hongfu Li wrote:
>>> Hi, Tejun
>>> Thanks for the review comments.
>>>
>>>>> Add dmem.events to report hierarchical low/max event counts per DMEM
>>>>> region. Increment counters on dmem.max allocation failures and
>>>>> dmem.low protection events. The file is available for non-root cgroups
>>>>> only.
>>>>
>>>> Please don't double space in descs or comments. Also, maybe it's obvious but
>>>> it'd help if you list why and how this is useful. Why do we want to add
>>>> this?
>>>
>>> I'll fix the double spacing in the commit message and comments.
>>>
>>> As for the motivation: dmem already exposes per-region limits and current
>>> usage, but not how often those limits actually matter at runtime. Without
>>> event counters, it's hard to tell whether allocation failures come from
>>> this cgroup, a parent limit, or pressure elsewhere in the hierarchy.
>>> dmem.events provides that visibility for tuning dmem.low/dmem.max and
>>> diagnosing recurring device memory pressure.
>>
>> Shouldn't you be able to deduce this rather trivially from just looking at the current usage together with the low/max limits you already set? I'm not sure I really see anything this events file provides that analysis of current usage and set limits doesn't? If your usage is highly variable, the separately-developed dmem.peak file might also suit your needs, but still, not sure what you can do with dmem.events that you can't already do with these tools.
> Thanks for the question.
>
> Besides exposing counters, dmem.events notifies userspace on changes via
> cgroup_file_notify(). This allows tools to monitor limit-related events
> (for example, allocation failures or low-protection fallbacks) asynchronously,
> without the need to periodically poll dmem.current against the limits. While
> you could infer some conditions from current usage and limits, polling is
> inefficient and cannot capture transient events in real time. dmem.peak only
> records the highest usage, not these specific events.
>
> So dmem.events provides both lower overhead and richer, actionable information.
Agreed, they're separate but both useful.
The peak tells you what the maximum memory consumption is.
The events are sent when a limit is reached, and the counters also record how often a limit is reached and reclaim needs to happen.
So if you have 4 cgroups, and 1 of them sends a lot of events, that tells you that you may want
to increase that cgroup's limits dynamically to have a more performant system.
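
As a rough userspace sketch of that comparison: the line format below
("<region> low=<n> max=<n>") is only an assumption for illustration, since the
actual dmem.events layout is whatever the patch ends up defining. The cgroup
names, the region name and the helper names are hypothetical as well.

```python
from typing import Dict

# Hypothetical dmem.events contents for four sibling cgroups.
# ASSUMPTION: one line per region, formatted "<region> low=<n> max=<n>".
SAMPLES: Dict[str, str] = {
    "compositor": "drm/card0/vram0 low=0 max=1\n",
    "render": "drm/card0/vram0 low=3 max=57\n",
    "video": "drm/card0/vram0 low=0 max=0\n",
    "compute": "drm/card0/vram0 low=1 max=4\n",
}


def parse_events(text: str) -> Dict[str, Dict[str, int]]:
    """Parse the assumed per-region format into {region: {counter: value}}."""
    regions: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        region, pairs = fields[0], fields[1:]
        regions[region] = {
            key: int(value) for key, value in (p.split("=", 1) for p in pairs)
        }
    return regions


def busiest_cgroup(samples: Dict[str, str], region: str, key: str = "max") -> str:
    """Return the cgroup whose `key` counter for `region` is the highest."""
    counts = {
        cgroup: parse_events(text).get(region, {}).get(key, 0)
        for cgroup, text in samples.items()
    }
    return max(counts, key=counts.get)
```

With the sample data above, busiest_cgroup(SAMPLES, "drm/card0/vram0") picks
"render", which would be the candidate for a higher dmem.max.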
Kind regards,
~Maarten Lankhorst