Re: [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path

From: Kairui Song

Date: Thu Apr 30 2026 - 04:00:54 EST


On Thu, Apr 30, 2026 at 3:43 PM Wenchao Hao <haowenchao22@xxxxxxxxx> wrote:
> The data I shared earlier was class_idx-in-obj only — no
> deferred freeing at all.
>
> > I couldn't immediately tell by looking at this vs. the cover letter. I wonder
> > what portion of the improvement comes from the deferred freeing?
>
> On top of that, we added deferred freeing in the zsmalloc
> layer (per-cpu page-pool based buffer swap + WQ_UNBOUND
> drain worker). With both class_idx + deferred:
>
> Test 1: concurrent munmap (256MB/process, RPi 4B):
>
> mode       Base      Deferred  Speedup
> single     56.2ms    17.2ms    3.27x
> multi 3p   153.2ms   51.5ms    2.97x
>
> Test 2: single process munmap (various sizes):
>
> size    Base      Deferred  Speedup
> 64MB    15.0ms    4.3ms     3.47x
> 128MB   28.7ms    8.5ms     3.37x
> 192MB   43.2ms    13.0ms    3.32x
> 256MB   57.0ms    17.3ms    3.30x
> 512MB   114.4ms   38.5ms    2.97x

Hi Wenchao,

One concern here is that the total amount of work is unchanged. You
observe a speedup because the work is offloaded to an async worker,
but under pressure those workers could become a larger burden. Is it
possible for you to measure that part too?
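
To make the concern concrete, here is a toy cost model, not a measurement of zsmalloc. The per-object costs (`free_cost`, `enqueue_cost`) and the object count are made-up numbers, and the function names are hypothetical. It only illustrates that deferring shrinks the caller-visible cost while the total (caller plus worker) does not shrink, and may grow by the queueing overhead.

```python
# Toy cost model: all constants are invented for illustration and are
# NOT derived from the benchmark numbers quoted above.
from dataclasses import dataclass


@dataclass
class Cost:
    caller: float  # work charged to the task calling munmap()
    worker: float  # work charged to the async drain worker

    @property
    def total(self) -> float:
        return self.caller + self.worker


def inline_free(n_objs: int, free_cost: float) -> Cost:
    """Every object is freed synchronously in the caller's context."""
    return Cost(caller=n_objs * free_cost, worker=0.0)


def deferred_free(n_objs: int, free_cost: float, enqueue_cost: float) -> Cost:
    """The caller only queues each object; the worker does the real free later."""
    return Cost(caller=n_objs * enqueue_cost, worker=n_objs * free_cost)


# Hypothetical workload: 65536 objects, queueing costs a quarter of a free.
base = inline_free(n_objs=65536, free_cost=1.0)
deferred = deferred_free(n_objs=65536, free_cost=1.0, enqueue_cost=0.25)

print(f"caller-visible speedup: {base.caller / deferred.caller:.2f}x")
print(f"total work ratio (deferred/base): {deferred.total / base.total:.2f}")
```

In this model the caller looks faster, yet the system as a whole does at least as much work, which is why the worker side needs to be measured as well.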