Re: [RFC PATCH v2 0/4] mm/zsmalloc: reduce zs_free() latency on swap release path
From: Xueyuan Chen
Date: Tue Apr 21 2026 - 20:35:45 EST
On Tue, Apr 21, 2026 at 11:25:17AM -0700, Nhat Pham wrote:
[...]
>Hmm, free_zspage() and kmem_cache_free().
>
>* kmem_cache_free() is just handle freeing. Bulk-freeing?
>
>* free_zspage() looks like just ordinary teardown work :( Seems like
>we're not spinning any lock here - we just try lock the backing pages,
>and the rest is normal work. Not sure how to optimize this - perhaps
>deferring is the only way.
>
>
Hi Nhat,
Currently, free_zspage() is called with class->lock held, and it
eventually invokes folio_put(), which may in turn acquire zone->lock.
This nests zone->lock under class->lock: when multiple CPUs contend for
the same class->lock and the current holder is stalled waiting for
zone->lock, the class->lock hold time grows significantly and every
other CPU queued on it waits correspondingly longer.
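Roughly, the critical section looks like this (a simplified sketch of
zs_free(); locking details and helper signatures differ across kernel
versions, and error handling is omitted):

void zs_free(struct zs_pool *pool, unsigned long handle)
{
	...
	spin_lock(&class->lock);
	obj_free(class->size, obj);
	fullness = fix_fullness_group(class, zspage);
	if (fullness == ZS_INUSE_RATIO_0)
		/*
		 * Tears down the backing pages: this ends up in
		 * folio_put()/free_unref_page(), which may take
		 * zone->lock - all while class->lock is still held.
		 */
		free_zspage(pool, class, zspage);
	spin_unlock(&class->lock);
	cache_free_handle(pool, handle);	/* the kmem_cache_free() seen in the trace */
	...
}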
Here is ftrace data comparing the normal case with the contended case:
under contention, the time spent in queued_spin_lock_slowpath() jumps
from ~1.3us to over 30us, which significantly increases the total
latency of zs_free().
7)               |  zs_free() {
7)   0.220 us    |    _raw_read_lock();
7)               |    _raw_spin_lock() {
7)   1.320 us    |      queued_spin_lock_slowpath();
7)   1.820 us    |    }
7)   0.170 us    |    _raw_read_unlock();
7)   0.170 us    |    obj_free();
7)   0.190 us    |    fix_fullness_group();
7)   0.150 us    |    _raw_spin_unlock();
7)   0.170 us    |    kmem_cache_free();
7)   4.610 us    |  }
---------------------------------------------------------
7)               |  zs_free() {
7)   0.230 us    |    _raw_read_lock();
7)               |    _raw_spin_lock() {
7) + 30.100 us   |      queued_spin_lock_slowpath();
7) + 30.600 us   |    }
7)   0.200 us    |    _raw_read_unlock();
7)   0.170 us    |    obj_free();
7)   0.170 us    |    fix_fullness_group();
7)   0.170 us    |    _raw_spin_unlock();
7)   0.210 us    |    kmem_cache_free();
7) + 33.850 us   |  }
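One way the deferral idea could look is to do only the list/stat
manipulation under class->lock and move the actual page freeing to
after the lock is dropped, roughly along these lines (illustrative
only; it glosses over trylock_zspage()/kick_deferred_free(), migration
handling, and any lock assertions inside the existing helpers, whose
exact signatures also vary between versions):

	bool empty = false;

	spin_lock(&class->lock);
	obj_free(class->size, obj);
	fullness = fix_fullness_group(class, zspage);
	if (fullness == ZS_INUSE_RATIO_0) {
		/*
		 * Only unlink the zspage from the class here; keep the
		 * expensive page teardown out of the critical section.
		 */
		remove_zspage(class, zspage);
		empty = true;
	}
	spin_unlock(&class->lock);

	if (empty)
		/*
		 * folio_put() and any zone->lock work now run without
		 * class->lock held.
		 */
		__free_zspage(pool, class, zspage);

The tricky part is keeping such a split safe against concurrent
migration/compaction and the existing deferred-free path.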
Best regards,
Xueyuan