Re: [RFC -next] memcg: Optimize creation performance when LRU_GEN is enabled
From: Chen Ridong
Date: Thu Dec 04 2025 - 08:01:55 EST
On 2025/12/4 20:59, Chen Ridong wrote:
>
>
> On 2025/11/27 17:04, Chen Ridong wrote:
>>
>>
>> On 2025/11/27 1:15, Johannes Weiner wrote:
>>> On Wed, Nov 19, 2025 at 08:37:22AM +0000, Chen Ridong wrote:
>>>> From: Chen Ridong <chenridong@xxxxxxxxxx>
>>>>
>>>> With LRU_GEN=y and LRU_GEN_ENABLED=n, a performance regression occurs
>>>> when creating a large number of memory cgroups (memcgs):
>>>>
>>>> # time mkdir testcg_{1..10000}
>>>>
>>>> real 0m7.167s
>>>> user 0m0.037s
>>>> sys 0m6.773s
>>>>
>>>> # time mkdir testcg_{1..20000}
>>>>
>>>> real 0m27.158s
>>>> user 0m0.079s
>>>> sys 0m26.270s
>>>>
>>>> In contrast, with LRU_GEN=n, creation of the same number of memcgs
>>>> is markedly faster and scales roughly linearly:
>>>>
>>>> # time mkdir testcg_{1..10000}
>>>>
>>>> real 0m3.386s
>>>> user 0m0.044s
>>>> sys 0m3.009s
>>>>
>>>> # time mkdir testcg_{1..20000}
>>>>
>>>> real 0m6.876s
>>>> user 0m0.075s
>>>> sys 0m6.121s
>>>>
>>>> The root cause is that lru_gen node onlining uses hlist_nulls_add_tail_rcu(),
>>>> which traverses the entire list to find the tail. This traversal scales
>>>> with the number of memcgs, even when LRU_GEN is runtime-disabled.
>>>
>>> Can you please look into removing the memcg LRU instead?
>>>
>>
>> Thanks Johannes, this is indeed a promising approach.
>>
>> The memcg LRU was originally designed exclusively for global reclaim scenarios. Before we move
>> forward with its removal, I'd like to hear Yu's thoughts on this.
>>
>> Hello Yu,
>>
>> Do you have any opinions on removing the memcg LRU?
>>
>
> Hello Johannes and Yu,
>
> I've sent patches to remove the memcg LRU and replace it with mem_cgroup_iter.
> I would appreciate it if you could take a look when you have time.
>
Just adding the link to the patch series:
https://lore.kernel.org/cgroups/20251204123124.1822965-1-chenridong@xxxxxxxxxxxxxxx/T/#t
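
For anyone skimming the thread, the scaling in the quoted report can be
illustrated in user space with a rough sketch like the one below. It is
not the kernel code: struct node, struct list_head_only, add_tail_walk()
and total_walk_steps() are made-up names, and the only thing carried over
is the assumption stated in the report, namely that every tail insert
walks the whole list because the head keeps no tail pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a head-only list: the head stores only a
 * pointer to the first node, so there is no shortcut to the tail. */
struct node {
	struct node *next;
};

struct list_head_only {
	struct node *first;
};

/* Append by walking from the head to the last node. Returns the number
 * of existing nodes visited on the way. */
static size_t add_tail_walk(struct list_head_only *h, struct node *n)
{
	size_t steps = 0;
	struct node **pp = &h->first;

	n->next = NULL;
	while (*pp) {
		pp = &(*pp)->next;
		steps++;
	}
	*pp = n;
	return steps;
}

/* Append `count` nodes to an empty list and return the total number of
 * nodes visited across all inserts. Returns 0 if allocation fails. */
unsigned long long total_walk_steps(size_t count)
{
	struct list_head_only h = { .first = NULL };
	struct node *nodes = calloc(count, sizeof(*nodes));
	unsigned long long total = 0;

	if (!nodes)
		return 0;
	for (size_t i = 0; i < count; i++)
		total += add_tail_walk(&h, &nodes[i]);
	free(nodes);
	return total;
}
```

With n inserts the walk count is n * (n - 1) / 2, so going from 10000 to
20000 entries roughly quadruples the work. That matches the shape of the
LRU_GEN=y timings above (7.167s vs 27.158s), whereas the LRU_GEN=n
timings only about double (3.386s vs 6.876s).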
--
Best regards,
Ridong