Re: [RFC] mm: support multi_freearea to the reduction of external fragmentation

From: David Hildenbrand
Date: Tue Apr 27 2021 - 08:46:13 EST


On 26.04.21 12:19, lipeifeng@xxxxxxxx wrote:
> Hi David Hildenbrand,
>
> >> And you don't mention what the baseline configuration was. For example,
> >> how was compaction configured?
> >> Just to clarify, what is monkey?
> >> Monkey HTTP server? MonkeyTest disk benchmark? UI/Application Exerciser
> >> Monkey?
>
> I am sorry that I didn't give a clear explanation of Monkey.
> I meant the "UI/Application Exerciser Monkey" from Google.
>
> Let me describe our tests:


Thanks for more details on the test.

> 1. Record COMPACTSTALL
> We tested the patch on linux-4.4, linux-4.9, linux-4.14 and linux-4.19, and
> the results show that the patch is effective in reducing COMPACTSTALL.
>     - Run Monkey for 12 hours.
>     - Record COMPACTSTALL after the test.
>
> Test result: COMPACTSTALL reduced by 95.6% with the patch
> (machine with 4 GB of physical memory, running linux-4.19).
> --------------------------------
>               | COMPACTSTALL
> --------------------------------
> ori           |     2189
> --------------------------------
> optimization  |       95
> --------------------------------
>
> I fully agree with the value of compaction, but compaction also consumes CPU
> time and increases the time spent in allocation stalls (alloc_stall). So if
> we can keep more free high-order pages in the buddy allocator instead of
> single pages, it will decrease COMPACTSTALL and speed up memory allocation.
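
The 95.6% figure follows from the two counter values in the table: (2189 - 95) / 2189 is roughly 0.9566. As a minimal sketch of how such a comparison could be scripted, the helpers below pull one counter out of text formatted like /proc/vmstat (where the counter is exposed as compact_stall) and compute the relative reduction. The function names vmstat_lookup and reduction_percent are made up for illustration and are not part of the patch or of the original test setup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in text formatted like /proc/vmstat
 * ("name value", one pair per line).
 * Returns 0 on success, -1 if the counter is not present.
 * Hypothetical helper, for illustration only.
 */
static int vmstat_lookup(const char *text, const char *name,
                         unsigned long long *value)
{
    size_t len = strlen(name);
    const char *p = text;

    assert(text && name && value);
    while (*p) {
        /* Require the full name followed by a space, so that
         * "compact_stall" does not match "compact_stall_foo". */
        if (strncmp(p, name, len) == 0 && p[len] == ' ')
            return sscanf(p + len, "%llu", value) == 1 ? 0 : -1;
        p = strchr(p, '\n');
        if (!p)
            break;
        p++; /* move to the start of the next line */
    }
    return -1;
}

/* Relative reduction, in percent, from a baseline count to a patched count. */
static double reduction_percent(unsigned long long before,
                                unsigned long long after)
{
    assert(before > 0);
    return 100.0 * ((double)before - (double)after) / (double)before;
}
```

Feeding the table values into reduction_percent(2189, 95) gives a value between 95.6 and 95.7. Note that this only compares event counts; it says nothing about how long each stall lasted.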

Okay, but then I assume the target goal of your patch set is to minimize CPU consumption/allocation stall time when allocating larger order pages.

Currently you state "the probability of high-order-pages allocation would be increased significantly", but I assume that's then not 100% correct. What you measure is the stall time when allocating higher-order pages, not whether you can allocate them.


> 2. Record the speed of high-order page allocation (order = 4 and order = 8)
> Before and after the optimization, we measured the speed of high-order page
> allocation after 120 hours of Monkey on 10 Android mobile phones. The results
> show that the speed increased by more than 18%.
>
> We also ran a test of our own design
> (machine with 4 GB of physical memory, running linux-4.19):
> we modeled typical user behavior by constantly starting and operating
> different applications for 120 hours. COMPACTSTALL decreased by more than
> 90% and the speed of high-order page allocation increased by more than 15%.

Okay, again, this is then some optimization for allocation speed; which makes it less attractive IMHO (at least for more invasive changes), because I suspect this mostly helps in corner cases (Monkey benchmarks corner cases AFAIU).


> I also have some questions; I hope you can guide me when you have time.
> 1) What do you mean by how compaction is configured?
>     Do you mean the members in struct zone, such as:
>     unsigned int compact_considered;
>     unsigned int compact_defer_shift;
>     int compact_order_failed;
>     bool compact_blockskip_failed;
>     Or some macros, such as:
>     PAGE_ALLOC_COSTLY_ORDER = 3
>     MIN_COMPACT_PRIORITY = 1
>     MAX_COMPACT_RETRIES = 16


Rather, whether you have proactive compaction (/proc/sys/vm/compaction_proactiveness) enabled. But I assume that, because you're messing with older kernels, you didn't compare against that yet. Would be worth a comparison.
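
For anyone wanting to check this on a test device, here is a minimal user-space sketch that reads the knob. It assumes the documented behavior of the sysctl (an integer from 0 to 100, where 0 disables proactive compaction) and that the file is simply absent on kernels that predate the feature, which, to my knowledge, includes all of the 4.x kernels mentioned in this thread. The function names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define PROACTIVENESS_PATH "/proc/sys/vm/compaction_proactiveness"

/*
 * Parse sysctl text such as "20\n".
 * Returns the value (0..100), or -1 if the text is not a valid setting.
 * Hypothetical helper, for illustration only.
 */
static int parse_proactiveness(const char *text)
{
    char *end;
    long v;

    assert(text);
    errno = 0;
    v = strtol(text, &end, 10);
    if (errno || end == text || v < 0 || v > 100)
        return -1;
    return (int)v;
}

/*
 * Read the current setting from procfs.
 * Returns 0..100, or -1 if the knob cannot be read (for example on a
 * kernel without proactive compaction).
 */
static int read_proactiveness(void)
{
    char buf[32];
    int ret = -1;
    FILE *f = fopen(PROACTIVENESS_PATH, "r");

    if (!f)
        return -1;
    if (fgets(buf, sizeof(buf), f))
        ret = parse_proactiveness(buf);
    fclose(f);
    return ret;
}
```

A result of -1 from read_proactiveness() on the baseline device would confirm that the comparison against proactive compaction has not been possible there yet.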

> >> 1) multi freearea (which might
> >> be problematic with sparsity)
> 2) Can you please tell me what sparsity is and what its impact is?
>     Are there any documents about it?

Essentially CONFIG_SPARSEMEM, whereby we can have huge holes in physical memory layout and memory areas coming/going with memory hot(un)plug. Usually we manage all metadata per section. For example, pageblocks are allocated per section. We avoid arrays that depend on the initial/maximum physical memory size.
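
To illustrate the per-section idea, here is a toy user-space model, not kernel code. Each section that is actually present gets its own small metadata array, allocated when the section comes online and freed when it goes away; holes cost nothing beyond one pointer. All names and constants (toy_section, 128 MiB sections, 2 MiB pageblocks, one byte of flags per pageblock) are assumptions picked for the example and do not mirror the real mem_section layout.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative constants, not taken from any kernel configuration. */
#define SECTION_SHIFT      27   /* 128 MiB per section */
#define PAGEBLOCK_SHIFT    21   /* 2 MiB per pageblock */
#define BLOCKS_PER_SECTION (1UL << (SECTION_SHIFT - PAGEBLOCK_SHIFT))
#define NR_SECTIONS        1024 /* toy physical address space: 128 GiB */

struct toy_section {
    unsigned char *pageblock_flags; /* NULL while the section is a hole */
};

static struct toy_section sections[NR_SECTIONS];

/* Memory hotplug: allocate metadata for the section covering phys_addr. */
static int toy_section_online(unsigned long long phys_addr)
{
    unsigned long long nr = phys_addr >> SECTION_SHIFT;

    assert(nr < NR_SECTIONS);
    if (!sections[nr].pageblock_flags) {
        sections[nr].pageblock_flags = calloc(BLOCKS_PER_SECTION, 1);
        if (!sections[nr].pageblock_flags)
            return -1;
    }
    return 0;
}

/* Memory hotunplug: the metadata goes away together with the section. */
static void toy_section_offline(unsigned long long phys_addr)
{
    unsigned long long nr = phys_addr >> SECTION_SHIFT;

    assert(nr < NR_SECTIONS);
    free(sections[nr].pageblock_flags);
    sections[nr].pageblock_flags = NULL;
}

/* Returns the flags byte of the pageblock at phys_addr, or NULL in a hole. */
static unsigned char *toy_pageblock_flags(unsigned long long phys_addr)
{
    unsigned long long nr = phys_addr >> SECTION_SHIFT;
    unsigned long idx;

    if (nr >= NR_SECTIONS || !sections[nr].pageblock_flags)
        return NULL;
    idx = (unsigned long)(phys_addr >> PAGEBLOCK_SHIFT) &
          (BLOCKS_PER_SECTION - 1);
    return &sections[nr].pageblock_flags[idx];
}

/* Metadata bytes currently allocated: scales with present memory only. */
static size_t toy_metadata_bytes(void)
{
    size_t total = 0;

    for (size_t i = 0; i < NR_SECTIONS; i++)
        if (sections[i].pageblock_flags)
            total += BLOCKS_PER_SECTION;
    return total;
}
```

With memory present only at 0 and at 64 GiB, toy_metadata_bytes() reports two sections' worth of metadata, whereas a flat array sized at boot for the whole range would have to cover the hole in between and could not grow for memory added later. Any per-range structure introduced by a patch has to cope with the same constraints.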

--
Thanks,

David / dhildenb