Re: [PATCH v3 2/2] mm: zswap: Tie per-CPU acomp_ctx lifetime to the pool.

From: Kanchana P. Sridhar

Date: Mon Apr 13 2026 - 16:41:33 EST


On Sun, Apr 12, 2026 at 2:42 PM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> On Tue, Mar 31, 2026 at 11:34 AM Kanchana P. Sridhar
> <kanchanapsridhar2026@xxxxxxxxx> wrote:
> >
> > Currently, per-CPU acomp_ctx are allocated on pool creation and/or CPU
> > hotplug, and destroyed on pool destruction or CPU hotunplug. This
> > complicates the lifetime management to save memory while a CPU is
> > offlined, which is not very common.
> >
> > Simplify lifetime management by allocating per-CPU acomp_ctx once on
> > pool creation (or CPU hotplug for CPUs onlined later), and keeping them
> > allocated until the pool is destroyed.
> >
> > Refactor cleanup code from zswap_cpu_comp_dead() into
> > acomp_ctx_free() to be used elsewhere.
> >
> > The main benefit of using the CPU hotplug multi state instance startup
> > callback to allocate the acomp_ctx resources is that it prevents the
> > cores from being offlined until the multi state instance addition call
> > returns.
> >
> > From Documentation/core-api/cpu_hotplug.rst:
> >
> > "The node list add/remove operations and the callback invocations are
> > serialized against CPU hotplug operations."
> >
> > Furthermore, zswap_[de]compress() cannot contend with
> > zswap_cpu_comp_prepare() because:
> >
> > - During pool creation/deletion, the pool is not in the zswap_pools
> > list.
> >
> > - During CPU hot[un]plug, the CPU is not yet online, as Yosry pointed
> > out. zswap_cpu_comp_prepare() will be run on a control CPU,
> > since CPUHP_MM_ZSWP_POOL_PREPARE is in the PREPARE section of "enum
> > cpuhp_state".
> >
> > In both these cases, any recursions into zswap reclaim from
> > zswap_cpu_comp_prepare() will be handled by the old pool.
> >
> > The above two observations enable the following simplifications:
> >
> > 1) zswap_cpu_comp_prepare():
> >
> > a) acomp_ctx mutex locking:
> >
> > If the process gets migrated while zswap_cpu_comp_prepare() is
> > running, it will complete on the new CPU. In case of failures, we
> > pass the acomp_ctx pointer obtained at the start of
> > zswap_cpu_comp_prepare() to acomp_ctx_free(), which again, can
> > only undergo migration. There appear to be no contention
> > scenarios that might cause inconsistent values of acomp_ctx's
> > members. Hence, it seems there is no need for
> > mutex_lock(&acomp_ctx->mutex) in zswap_cpu_comp_prepare().
> >
> > b) acomp_ctx mutex initialization:
> >
> > Since the pool is not yet on zswap_pools list, we don't need to
> > initialize the per-CPU acomp_ctx mutex in
> > zswap_pool_create(). This has been restored to occur in
> > zswap_cpu_comp_prepare().
> >
> > c) Subsequent CPU offline-online transitions:
> >
> > zswap_cpu_comp_prepare() checks upfront if acomp_ctx->acomp is
> > valid. If so, it returns success. This should handle any CPU
> > hotplug online-offline transitions after pool creation is done.
> >
> > 2) CPU offline vis-a-vis zswap ops:
> >
> > Let's suppose the process is migrated to another CPU before the
> > current CPU is dysfunctional. If zswap_[de]compress() holds the
> > acomp_ctx->mutex lock of the offlined CPU, that mutex will be
> > released once it completes on the new CPU. Since there is no
> > teardown callback, there is no possibility of UAF.
> >
> > 3) Pool creation/deletion and process migration to another CPU:
> >
> > During pool creation/deletion, the pool is not in the zswap_pools
> > list. Hence it cannot contend with zswap ops on that CPU. However,
> > the process can get migrated.
> >
> > a) Pool creation --> zswap_cpu_comp_prepare()
> > --> process migrated:
> > * Old CPU offline: no-op.
> > * zswap_cpu_comp_prepare() continues
> > to run on the new CPU to finish
> > allocating acomp_ctx resources for
> > the offlined CPU.
> >
> > b) Pool deletion --> acomp_ctx_free()
> > --> process migrated:
> > * Old CPU offline: no-op.
> > * acomp_ctx_free() continues
> > to run on the new CPU to finish
> > de-allocating acomp_ctx resources
> > for the offlined CPU.
> >
> > 4) Pool deletion vis-a-vis CPU onlining:
> >
> > The call to cpuhp_state_remove_instance() cannot race with
> > zswap_cpu_comp_prepare() because of hotplug synchronization.
> >
> > The current acomp_ctx_get_cpu_lock()/acomp_ctx_put_unlock() are
> > deleted. Instead, zswap_[de]compress() directly call
> > mutex_[un]lock(&acomp_ctx->mutex).
> >
> > The per-CPU memory cost of not deleting the acomp_ctx resources upon CPU
> > offlining, and only deleting them when the pool is destroyed, is 8.28 KB
> > on x86_64. This cost is only paid when a CPU is offlined, until it is
> > onlined again.
> >
> > Co-developed-by: Kanchana P. Sridhar <kanchanapsridhar2026@xxxxxxxxx>
> > Signed-off-by: Kanchana P. Sridhar <kanchanapsridhar2026@xxxxxxxxx>
> > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@xxxxxxxxx>
>
> Lol.
>
> > Acked-by: Yosry Ahmed <yosry@xxxxxxxxxx>
>
> Thanks for simplifying this :) My brain always hurts when I have to
> handle CPU offlining for per-cpu structures. I had to deal with this
> because I added per-CPU caching for a structure (with reference
> counting) in another patch series of mine :)
>
> Acked-by: Nhat Pham <nphamcs@xxxxxxxxx>

Good to know, Nhat! And thanks for the Acked-by :)
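
For anyone following the thread, here is a rough userspace model of the
lifetime rule described in the quoted commit message: allocate once in the
prepare step, return early if the context is already populated, do nothing
on offline, and free only when the pool goes away. This is a sketch under
stated assumptions, not the kernel code. Every name in it (pool_model,
acomp_ctx_model, model_cpu_prepare() and so on) is hypothetical, a plain
array stands in for per-CPU data, malloc() stands in for the acomp/request/
buffer allocations, and a pthread mutex stands in for acomp_ctx->mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4     /* hypothetical CPU count for this model */
#define MODEL_BUF_SIZE 8192 /* hypothetical per-CPU resource size */

/* Stand-in for the per-CPU acomp_ctx; buffer != NULL means "allocated". */
struct acomp_ctx_model {
	pthread_mutex_t mutex;
	void *buffer;
};

/* Stand-in for a zswap pool that owns one context per possible CPU. */
struct pool_model {
	struct acomp_ctx_model ctx[MODEL_NR_CPUS];
	int nr_allocs; /* counts real allocations, for illustration only */
};

/*
 * Models the startup (prepare) callback: idempotent, so a CPU that goes
 * offline and comes back online reuses the resources it already has.
 */
static int model_cpu_prepare(struct pool_model *pool, int cpu)
{
	struct acomp_ctx_model *ctx = &pool->ctx[cpu];

	if (ctx->buffer)
		return 0; /* already set up by an earlier online event */

	ctx->buffer = malloc(MODEL_BUF_SIZE);
	if (!ctx->buffer)
		return -1;

	pthread_mutex_init(&ctx->mutex, NULL);
	pool->nr_allocs++;
	return 0;
}

/* Models CPU offlining with no teardown callback: nothing is freed. */
static void model_cpu_offline(struct pool_model *pool, int cpu)
{
	(void)pool;
	(void)cpu;
}

/* Models pool destruction: the only place where contexts are released. */
static void model_pool_destroy(struct pool_model *pool)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct acomp_ctx_model *ctx = &pool->ctx[cpu];

		if (!ctx->buffer)
			continue; /* this CPU was never prepared */

		pthread_mutex_destroy(&ctx->mutex);
		free(ctx->buffer);
		ctx->buffer = NULL;
	}
}
```

In this model, calling model_cpu_prepare() for a CPU, then
model_cpu_offline(), then model_cpu_prepare() again leaves the same buffer
pointer in place and nr_allocs at 1. A holder of the mutex therefore never
sees the context disappear underneath it, which is the no-UAF argument in
point 2) of the quoted message.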

Best regards,
Kanchana