Re: [RFC PATCH v2 2/4] mm/zsmalloc: introduce zs_free_deferred() for async handle freeing

From: Barry Song

Date: Tue Apr 21 2026 - 17:43:16 EST

On Wed, Apr 22, 2026 at 3:47 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> On Tue, Apr 21, 2026 at 5:16 AM Wenchao Hao <haowenchao22@xxxxxxxxx> wrote:
> >
> > zs_free() is expensive due to internal locking (pool->lock, class->lock)
> > and potential zspage freeing. On the process exit path, the slow
> > zs_free() blocks memory reclamation, delaying overall memory release.
> > This has been reported to significantly impact Android low-memory
> > killing where slot_free() accounts for over 80% of the total swap
> > entry freeing cost.
> >
> > Introduce zs_free_deferred() which queues handles into a fixed-size
> > per-pool array for later processing by a workqueue. This allows callers
> > to defer the expensive zs_free() and return quickly, so the process
> > exit path can release memory faster. The array capacity is derived from
> > a 128MB uncompressed data budget (128MB >> PAGE_SHIFT entries), which
> > scales naturally with PAGE_SIZE. When the array reaches half capacity,
> > the workqueue is scheduled to drain pending handles.
> >
> > zs_free_deferred() uses spin_trylock() to access the deferred queue.
> > If the lock is contended (e.g. drain in progress) or the queue is full,
> > it falls back to synchronous zs_free() to guarantee correctness.
> >
> > Also introduce zs_free_deferred_flush() for use during pool teardown to
> > ensure all pending handles are freed.
>
> Hmmm per-pool workqueue.
>
> Does that mean that if you only have one zs pool (in the case of
> zswap, or if you only have one zram device), you'll have less
> concurrency in freeing up zsmalloc memory for process teardown? Would
> this be problematic?

I believe so, as reported in the original email from Lei and Zhiguo,
which proposed introducing a swap entries list for async free.

>
> I think Kairui was also suggesting per-cpu-fying these batches/queues.

I guess a per-size-class workqueue might strike a balance
between scalability and reducing lock contention across
multiple classes, where the locks actually reside.
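
To make the discussion a bit more concrete, below is a rough userspace
sketch of the deferral scheme as the changelog describes it: a
fixed-size array sized from the 128MB budget, a trylock on the enqueue
path, a synchronous fallback when the lock is contended or the array is
full, and a drain request at half capacity. It is not kernel code and
is not taken from the patch. Every name in it (deferred_queue, dq_*,
slow_free, SKETCH_PAGE_SHIFT) is made up for illustration, slow_free()
merely stands in for zs_free(), and the atomic_flag stands in for the
queue spinlock. 4K pages are assumed.

```c
/* Userspace sketch only; not kernel code and not taken from the patch.
 * All dq_* names, slow_free() and SKETCH_PAGE_SHIFT are hypothetical. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12                    /* assumption: 4K pages */
#define BUDGET_BYTES (128UL << 20)              /* 128MB uncompressed budget */
#define DQ_CAPACITY  (BUDGET_BYTES >> SKETCH_PAGE_SHIFT) /* 32768 at 4K */

struct deferred_queue {
	atomic_flag lock;             /* stand-in for the queue spinlock */
	size_t count;                 /* protected by lock */
	bool drain_requested;         /* protected by lock */
	unsigned long handles[DQ_CAPACITY];
	atomic_size_t sync_fallbacks; /* times the enqueue path fell back */
	atomic_size_t total_freed;    /* handles passed to slow_free() */
};

static void dq_init(struct deferred_queue *q)
{
	atomic_flag_clear(&q->lock);
	q->count = 0;
	q->drain_requested = false;
	atomic_init(&q->sync_fallbacks, 0);
	atomic_init(&q->total_freed, 0);
}

static bool dq_trylock(struct deferred_queue *q)
{
	return !atomic_flag_test_and_set_explicit(&q->lock,
						  memory_order_acquire);
}

static void dq_unlock(struct deferred_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Stand-in for the expensive synchronous free; only counts calls. */
static void slow_free(struct deferred_queue *q, unsigned long handle)
{
	(void)handle;
	atomic_fetch_add(&q->total_freed, 1);
}

/* Enqueue if cheap; otherwise free synchronously so nothing is lost. */
static void dq_free_deferred(struct deferred_queue *q, unsigned long handle)
{
	if (!dq_trylock(q)) {                 /* contended, e.g. drain running */
		atomic_fetch_add(&q->sync_fallbacks, 1);
		slow_free(q, handle);
		return;
	}
	if (q->count == DQ_CAPACITY) {        /* queue full */
		dq_unlock(q);
		atomic_fetch_add(&q->sync_fallbacks, 1);
		slow_free(q, handle);
		return;
	}
	q->handles[q->count++] = handle;
	if (q->count >= DQ_CAPACITY / 2)
		q->drain_requested = true;    /* kernel side would queue work */
	dq_unlock(q);
}

/* Worker side: free everything queued so far, return how many. */
static size_t dq_drain(struct deferred_queue *q)
{
	size_t n, i;

	while (!dq_trylock(q))
		;                             /* spin until the lock is ours */
	n = q->count;
	for (i = 0; i < n; i++)
		slow_free(q, q->handles[i]);
	q->count = 0;
	q->drain_requested = false;
	dq_unlock(q);
	return n;
}
```

With a single queue like this, every enqueue that races with
dq_drain() takes the synchronous path, which is the concurrency
concern raised above. A per-size-class variant would presumably hold
one such queue per class, so that a drain of one class only pushes
frees for that class onto the slow path. How the caller would learn
the class cheaply is left open here.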

Thanks
Barry