Re: [PATCH 1/1] mm: vmalloc: Optimize vmap_lazy_nr arithmetic when purging each vmap_area

From: Andrew Morton
Date: Fri Aug 30 2024 - 20:33:45 EST


On Thu, 29 Aug 2024 21:06:33 +0800 Adrian Huang <adrianhuang0701@xxxxxxxxx> wrote:

> From: Adrian Huang <ahuang12@xxxxxxxxxx>
>
> When running the vmalloc stress test on a 448-core system, the average
> latency of purge_vmap_node() is about 2 seconds, as measured with the
> eBPF/bcc 'funclatency.py' tool [1].
>
> ...
>
> 3) The data in columns "w/o patch" and "w/ patch"
> * Unit: microseconds (us)
> * Each value is the average of three measurements
>
> System            w/o patch (us)   w/ patch (us)   Improvement (%)
> ---------------   --------------   -------------   ---------------
> 72-core server              2194              14            99.36%
> 192-core server           143799            1139            99.21%
> 448-core server          1992122            6883            99.65%
>

Holy cow. I'll add cc:stable to this. Partly because it fixes a
softlockup, partly because holy cow.
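
For reference, the "Improvement (%)" column in the quoted table can be
re-derived from the two latency columns. The sketch below assumes the
improvement is defined as the latency reduction relative to the unpatched
latency; the quoted mail does not spell the formula out. The names
MEASUREMENTS and improvement_pct are illustrative only and do not come from
the patch.

```python
# Latencies in microseconds, copied from the quoted table:
# system -> (without patch, with patch).
MEASUREMENTS = {
    "72-core server": (2194, 14),
    "192-core server": (143799, 1139),
    "448-core server": (1992122, 6883),
}


def improvement_pct(before_us: int, after_us: int) -> float:
    """Latency reduction as a percentage of the unpatched latency.

    Assumed definition: (before - after) / before * 100.
    """
    return (before_us - after_us) / before_us * 100.0


for system, (before_us, after_us) in MEASUREMENTS.items():
    print(f"{system}: {improvement_pct(before_us, after_us):.2f}%")
```

Under that assumed definition, the three computed values agree with the
percentages reported in the table.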