RE: [PATCH v13 19/22] mm: zswap: Per-CPU acomp_ctx resources exist from pool creation to deletion.
From: Sridhar, Kanchana P
Date: Fri Dec 12 2025 - 15:53:18 EST
> -----Original Message-----
> From: Yosry Ahmed <yosry.ahmed@xxxxxxxxx>
> Sent: Friday, December 12, 2025 10:44 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> hannes@xxxxxxxxxxx; nphamcs@xxxxxxxxx; chengming.zhou@xxxxxxxxx;
> usamaarif642@xxxxxxxxx; ryan.roberts@xxxxxxx; 21cnbao@xxxxxxxxx;
> ying.huang@xxxxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx;
> senozhatsky@xxxxxxxxxxxx; sj@xxxxxxxxxx; kasong@xxxxxxxxxxx; linux-
> crypto@xxxxxxxxxxxxxxx; herbert@xxxxxxxxxxxxxxxxxxx;
> davem@xxxxxxxxxxxxx; clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx;
> ebiggers@xxxxxxxxxx; surenb@xxxxxxxxxx; Accardi, Kristen C
> <kristen.c.accardi@xxxxxxxxx>; Gomes, Vinicius <vinicius.gomes@xxxxxxxxx>;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh
> <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v13 19/22] mm: zswap: Per-CPU acomp_ctx resources
> exist from pool creation to deletion.
>
> On Fri, Dec 12, 2025 at 06:17:07PM +0000, Sridhar, Kanchana P wrote:
> > >
> > > > ret =
> > > cpuhp_state_add_instance(CPUHP_MM_ZSWP_POOL_PREPARE,
> > > > &pool->node);
> > > > if (ret)
> > > > - goto error;
> > > > + goto ref_fail;
> > >
> > > IIUC we shouldn't call cpuhp_state_remove_instance() on failure, we
> > > probably should add a new label.
> >
> > In this case we should because it is part of the pool creation failure
> > handling flow, at the end of which, the pool will be deleted.
>
> What I mean is, when cpuhp_state_add_instance() fails we goto ref_fail
> which will call cpuhp_state_remove_instance(). But the current code does
> not call cpuhp_state_remove_instance() if cpuhp_state_add_instance()
> fails.
I see what you mean. The current mainline code does not call
cpuhp_state_remove_instance() if cpuhp_state_add_instance() fails, because
the cpuhotplug code will call the teardown callback in this case.
In this patch, I do need to call cpuhp_state_remove_instance() and
acomp_ctx_dealloc() in this case because there is no teardown callback
being registered.
>
> >
> > >
> > > >
> > > > /* being the current pool takes 1 ref; this func expects the
> > > > * caller to always add the new pool as the current pool
> > > > @@ -292,6 +313,9 @@ static struct zswap_pool
> *zswap_pool_create(char
> > > *compressor)
> > > >
> > > > ref_fail:
> > > > cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE,
> > > &pool->node);
> > > > +
> > > > + for_each_possible_cpu(cpu)
> > > > + acomp_ctx_dealloc(per_cpu_ptr(pool->acomp_ctx, cpu));
> > > > error:
> > > > if (pool->acomp_ctx)
> > > > free_percpu(pool->acomp_ctx);
> > > > @@ -322,9 +346,15 @@ static struct zswap_pool
> > > *__zswap_pool_create_fallback(void)
> > > >
> > > > static void zswap_pool_destroy(struct zswap_pool *pool)
> > > > {
> > > > + int cpu;
> > > > +
> > > > zswap_pool_debug("destroying", pool);
> > > >
> > > > cpuhp_state_remove_instance(CPUHP_MM_ZSWP_POOL_PREPARE,
> > > &pool->node);
> > > > +
> > > > + for_each_possible_cpu(cpu)
> > > > + acomp_ctx_dealloc(per_cpu_ptr(pool->acomp_ctx, cpu));
> > > > +
> > > > free_percpu(pool->acomp_ctx);
> > > >
> > > > zs_destroy_pool(pool->zs_pool);
> > > > @@ -736,39 +766,35 @@ static int
> zswap_cpu_comp_prepare(unsigned int
> > > cpu, struct hlist_node *node)
> > > > {
> > > > struct zswap_pool *pool = hlist_entry(node, struct zswap_pool,
> > > node);
> > > > struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool-
> > > >acomp_ctx, cpu);
> > > > - struct crypto_acomp *acomp = NULL;
> > > > - struct acomp_req *req = NULL;
> > > > - u8 *buffer = NULL;
> > > > - int ret;
> > > > + int ret = -ENOMEM;
> > > >
> > > > - buffer = kmalloc_node(PAGE_SIZE, GFP_KERNEL, cpu_to_node(cpu));
> > > > - if (!buffer) {
> > > > - ret = -ENOMEM;
> > > > - goto fail;
> > > > - }
> > > > + /*
> > > > + * To handle cases where the CPU goes through online-offline-online
> > > > + * transitions, we return if the acomp_ctx has already been initialized.
> > > > + */
> > > > + if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
> > > > + return 0;
> > >
> > > Is it possible for acomp_ctx->acomp to be an ERR value here? If it is,
> > > then zswap initialization should have failed. Maybe WARN_ON_ONCE() for
> > > that case?
> >
> > This is checking for a valid acomp_ctx->acomp using the same criteria
> > being uniformly used in acomp_ctx_dealloc(). This check is necessary to
> > handle the case where the CPU goes through online-offline-online state
> > transitions.
>
> I think I am confused. I thought now we don't free this on CPU offline,
> so either it's NULL because this is the first time we initialize it on
> this CPU, or it is allocated.
Yes, this is correct.
> If it is an ERR value, then the pool
> creation should have failed and we wouldn't be calling this again on CPU
> online.
>
> In other words, what scenario do we expect to legitimately see an ERR
> value here?
I am using "(!IS_ERR_OR_NULL(acomp_ctx->acomp)" as a check for the
acomp being allocated already. I could instead have used "if (acomp_ctx->acomp)",
but use the former to be consistent with patch 20/22.
I cannot think of a scenario where we can expect an ERR value here.
Thanks,
Kanchana