Re: [RFC PATCH] zram: support asynchronous GC for lazy slot freeing

From: Barry Song

Date: Thu Apr 16 2026 - 04:14:31 EST


On Thu, Apr 16, 2026 at 3:41 PM Sergey Senozhatsky
<senozhatsky@xxxxxxxxxxxx> wrote:
>
> On (26/04/14 13:49), Xueyuan Chen wrote:
> > On Sun, Apr 12, 2026 at 07:48:48PM +0800, Kairui Song wrote:
> > [...]
> > >What is making this slot_free so costly? zs_free?
> >
> > Yes, I've captured some perf data on RK3588 cpu2:
> >
> > -    3.79%     0.42%  zram  [zram]  [k] slot_free
> >    - 89.04% slot_free
> >       - 65.40% zs_free
> >          + 77.29% free_zspage
> >          + 21.75% kmem_cache_free
> >            0.68% __kern_my_cpu_offset
> >       + 13.19% _raw_spin_unlock
> >       + 4.86% _raw_read_unlock
> >         4.75% obj_free
> >       + 4.72% _raw_read_lock
> >         3.64% fix_fullness_group
> >       + 2.02% _raw_spin_lock
> >       + 1.31% kmem_cache_free
> >
> > It's clear that zs_free is the primary hotspot, accounting for ~65.40%
> > of the total slot_free cycles. Beyond that, there is some read-lock and
> > spinlock overhead in slot_free.
>
> Just a random thought: if zs_free() is costly then it likely also affects
> zswap, which makes me wonder if doing something on the zsmalloc side is a
> "better" way forward.

Xueyuan's perf data shows that 65.4% of slot_free is spent in
zs_free, so around 35% is still spent elsewhere. However, this
might be a measurement issue. If we can confirm that the number is
actually >=90% or so, moving GC into zsmalloc seems like a better
option. My real use case in the Android industry is zram rather than
zswap, but this would benefit both zswap and zram.

Meanwhile, I would also like to check whether combining the many bit
operations, such as the clear_slot_flag() calls, can help further.
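
To make the bit-operation idea concrete, here is a minimal userspace
sketch, not the actual zram code. The flag names, their bit positions
and the helper functions are all hypothetical; the only point is that
several per-flag clears can be folded into one masked update. Whether
that is measurable in the real slot_free path still needs to be tested.

```c
#include <assert.h>

/* Hypothetical flag bit positions; not the real zram flag layout. */
enum slot_flag {
	SLOT_IDLE,
	SLOT_INCOMPRESSIBLE,
	SLOT_PP,
	SLOT_WB,
	SLOT_NR_FLAGS,
};

#define SLOT_BIT(f)	(1UL << (f))

/* Flags that a (hypothetical) free path wants to drop together. */
#define SLOT_FREE_MASK	(SLOT_BIT(SLOT_IDLE) | \
			 SLOT_BIT(SLOT_INCOMPRESSIBLE) | \
			 SLOT_BIT(SLOT_PP))

/* One read-modify-write per flag, as with repeated per-flag clears. */
static unsigned long clear_one_by_one(unsigned long flags)
{
	flags &= ~SLOT_BIT(SLOT_IDLE);
	flags &= ~SLOT_BIT(SLOT_INCOMPRESSIBLE);
	flags &= ~SLOT_BIT(SLOT_PP);
	return flags;
}

/* A single read-modify-write with a precomputed mask. */
static unsigned long clear_combined(unsigned long flags)
{
	return flags & ~SLOT_FREE_MASK;
}

/* Check that both variants agree for every combination of flag bits. */
static void check_equivalence(void)
{
	unsigned long flags;

	for (flags = 0; flags < SLOT_BIT(SLOT_NR_FLAGS); flags++)
		assert(clear_one_by_one(flags) == clear_combined(flags));
}
```

In plain userspace code like this the compiler will likely merge the
three clears on its own, so the sketch only shows that the two forms
are equivalent, not that the combined form is faster.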

Thanks
Barry