Re: [PATCH] mm, proc: collect percpu free pages into the free pages

From: mawupeng
Date: Tue Sep 10 2024 - 08:11:48 EST

On 2024/9/4 15:28, Michal Hocko wrote:
> On Wed 04-09-24 14:49:20, mawupeng wrote:
>>
>>
>> On 2024/9/3 16:09, Michal Hocko wrote:
>>> On Tue 03-09-24 09:50:48, mawupeng wrote:
>>>>> Draining remote PCPs may not be that expensive now, after commit 4b23a68f9536
>>>>> ("mm/page_alloc: protect PCP lists with a spinlock"). No IPI is needed
>>>>> to drain the remote PCP.
>>>>
>>>> This looks really great. We can think of a way to drain the pcp before
>>>> going to the slowpath, before swapping.
>>>
>>> We currently drain after first unsuccessful direct reclaim run. Is that
>>> insufficient?
>>
>> The reason I said the drain of pcp is insufficient or expensive is based
>> on your comment[1] :-). Since IPIs are no longer required after commit
>> 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock"), this
>> could be much better.
>>
>> [1]: https://lore.kernel.org/linux-mm/ZWRYZmulV0B-Jv3k@tiehlicka/
>
> There are other reasons I have mentioned in that reply which play a role
> as well.
>
>>> Should we do a less aggressive draining sooner? Ideally
>>> restricted to cpus on the same NUMA node maybe? Do you have any specific
>>> workloads that would benefit from this?
>>
>> Currently the problem is the amount of memory held in the pcp, which can
>> grow to 4.6% (24644M) of the total 512G of memory.
>
> Why is that a problem?

MemAvailable
An estimate of how much memory is available for starting new
applications, without swapping. Calculated from MemFree,
SReclaimable, the size of the file LRU lists, and the low
watermarks in each zone.
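
To make the description above concrete, here is a rough user-space sketch of how such an estimate could be put together. It is only a simplified model of the inputs the documentation names, not the kernel code, and the real calculation has more terms. The struct, the function names and the `count_pcp` switch are made up for illustration; `count_pcp` stands for the accounting question raised in this thread.

```c
/* Hypothetical snapshot of the counters involved, all in pages. */
struct mem_counters {
	long free_pages;       /* MemFree */
	long file_lru;         /* active + inactive file LRU lists */
	long slab_reclaimable; /* SReclaimable */
	long wmark_low;        /* sum of the low watermarks of all zones */
	long pcp_pages;        /* free pages sitting on per-CPU lists */
};

static long min_long(long a, long b)
{
	return a < b ? a : b;
}

/*
 * Simplified estimate: start from the free pages, then add the part of
 * the file LRU and of the reclaimable slab that is assumed to be usable
 * without pushing the system towards swapping. For each of the two, at
 * least half of it, or the low watermark if that is smaller, is treated
 * as not available.
 *
 * count_pcp is an assumption of this sketch: when non-zero, the per-CPU
 * free pages are counted as available too.
 */
static long estimate_available(const struct mem_counters *c, int count_pcp)
{
	long available = c->free_pages;
	long pagecache = c->file_lru;
	long reclaimable = c->slab_reclaimable;

	pagecache -= min_long(pagecache / 2, c->wmark_low);
	available += pagecache;

	reclaimable -= min_long(reclaimable / 2, c->wmark_low);
	available += reclaimable;

	if (count_pcp)
		available += c->pcp_pages;

	return available < 0 ? 0 : available;
}
```

In this model the two results differ by exactly `pcp_pages`, so a large pcp cache shows up directly as a gap in the reported estimate.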

The PCP memory is essentially available memory and will be reclaimed before OOM.
In essence, it is not fundamentally different from reclaiming file pages, as both
are reclaimed within __alloc_pages_direct_reclaim. So why shouldn't it be
included in MemAvailable, to avoid confusion?

__alloc_pages_direct_reclaim
  __perform_reclaim
  if (!page && !drained)
    drain_all_pages(NULL);
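
The drain-and-retry step in the call chain above can be modelled in user space roughly as follows. This is a toy sketch, not the kernel implementation: `struct toy_zone` and all `toy_*` functions are invented for illustration, and the model assumes the failing allocation cannot see pages cached on other CPUs' lists until they are drained.

```c
#include <stdbool.h>

/* Toy zone: pages on the buddy free list vs. pages cached on remote PCP lists. */
struct toy_zone {
	long buddy_free;
	long remote_pcp_cached;
};

/* Stand-in for the free list allocation attempt: only sees buddy_free. */
static bool toy_get_page(struct toy_zone *z)
{
	if (z->buddy_free == 0)
		return false;
	z->buddy_free--;
	return true;
}

/* Stand-in for drain_all_pages(NULL): return every cached page to the buddy list. */
static void toy_drain_all(struct toy_zone *z)
{
	z->buddy_free += z->remote_pcp_cached;
	z->remote_pcp_cached = 0;
}

/*
 * Retry logic after reclaim has run: if the first attempt fails and the
 * PCP lists have not been drained yet, drain them once and try again.
 */
static bool toy_direct_reclaim_alloc(struct toy_zone *z)
{
	bool drained = false;
	bool got_page;

retry:
	got_page = toy_get_page(z);
	if (!got_page && !drained) {
		toy_drain_all(z);
		drained = true;
		goto retry;
	}
	return got_page;
}
```

With an empty buddy list and a few pages on the PCP lists, the allocation in this model only succeeds because of the drain, which is the sense in which PCP pages are reclaimed before OOM.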


> Just because some tools are miscalculating memory
> pressure because they are based on MemAvailable? Or does this lead to
> performance regressions on the kernel side? In other words, would the
> same workload behave better if the amount of pcp-cache was reduced
> without any userspace intervention?