Re: [PATCH] mm, proc: collect percpu free pages into the free pages

From: Huang, Ying
Date: Wed Sep 11 2024 - 01:41:04 EST


mawupeng <mawupeng1@xxxxxxxxxx> writes:

> On 2024/9/4 15:28, Michal Hocko wrote:
>> On Wed 04-09-24 14:49:20, mawupeng wrote:
>>>
>>>
>>> On 2024/9/3 16:09, Michal Hocko wrote:
>>>> On Tue 03-09-24 09:50:48, mawupeng wrote:
>>>>>> Drain remote PCP may be not that expensive now after commit 4b23a68f9536
>>>>>> ("mm/page_alloc: protect PCP lists with a spinlock"). No IPI is needed
>>>>>> to drain the remote PCP.
>>>>>
>>>>> This looks really great, we can think a way to drop pcp before goto slowpath
>>>>> before swap.
>>>>
>>>> We currently drain after first unsuccessful direct reclaim run. Is that
>>>> insufficient?
>>>
>>> The reason i said the drain of pcp is insufficient or expensive is based
>>> on you comment[1] :-). Since IPIs is not requiered since commit 4b23a68f9536
>>> ("mm/page_alloc: protect PCP lists with a spinlock"). This could be much
>>> better.
>>>
>>> [1]: https://lore.kernel.org/linux-mm/ZWRYZmulV0B-Jv3k@tiehlicka/
>>
>> there are other reasons I have mentioned in that reply which play role
>> as well.
>>
>>>> Should we do a less aggressive draining sooner? Ideally
>>>> restricted to cpus on the same NUMA node maybe? Do you have any specific
>>>> workloads that would benefit from this?
>>>
>>> Current the problem is amount the pcp, which can increase to 4.6%(24644M)
>>> of the total 512G memory.
>>
>> Why is that a problem?
>
> MemAvailable
> An estimate of how much memory is available for starting new
> applications, without swapping. Calculated from MemFree,
> SReclaimable, the size of the file LRU lists, and the low
> watermarks in each zone.
>
> The PCP memory is essentially available memory and will be reclaimed before OOM.
> In essence, it is not fundamentally different from reclaiming file pages, as both
> are reclaimed within __alloc_pages_direct_reclaim. Therefore, why shouldn't it be
> included in MemAvailable to avoid confusion.
>
> __alloc_pages_direct_reclaim
> __perform_reclaim
> if (!page && !drained)
> drain_all_pages(NULL);
>
>
>> Just because some tools are miscalculating memory
>> pressure because they are based on MemAvailable? Or does this lead to
>> performance regressions on the kernel side? In other words would the
>> same workload behaved better if the amount of pcp-cache was reduced
>> without any userspace intervention?

Back to the original PCP cache issue. I want to make sure that whether
PCP auto-tuning works properly on your system. If so, the total PCP
pages should be less than the sum of the low watermark of zones. Can
you verify that first?

--
Best Regards,
Huang, Ying