RE: [PATCH v4 09/10] mm: zswap: Allocate pool batching resources if the crypto_alg supports batching.

From: Sridhar, Kanchana P
Date: Mon Dec 02 2024 - 19:31:02 EST


Hi Nhat,

> -----Original Message-----
> From: Nhat Pham <nphamcs@xxxxxxxxx>
> Sent: Monday, December 2, 2024 11:16 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> hannes@xxxxxxxxxxx; yosryahmed@xxxxxxxxxx;
> chengming.zhou@xxxxxxxxx; usamaarif642@xxxxxxxxx;
> ryan.roberts@xxxxxxx; ying.huang@xxxxxxxxx; 21cnbao@xxxxxxxxx;
> akpm@xxxxxxxxxxxxxxxxxxxx; linux-crypto@xxxxxxxxxxxxxxx;
> herbert@xxxxxxxxxxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> clabbe@xxxxxxxxxxxx; ardb@xxxxxxxxxx; ebiggers@xxxxxxxxxx;
> surenb@xxxxxxxxxx; Accardi, Kristen C <kristen.c.accardi@xxxxxxxxx>;
> Feghali, Wajdi K <wajdi.k.feghali@xxxxxxxxx>; Gopal, Vinodh
> <vinodh.gopal@xxxxxxxxx>
> Subject: Re: [PATCH v4 09/10] mm: zswap: Allocate pool batching resources if
> the crypto_alg supports batching.
>
> On Fri, Nov 22, 2024 at 11:01 PM Kanchana P Sridhar
> <kanchana.p.sridhar@xxxxxxxxx> wrote:
> >
> > This patch does the following:
> >
> > 1) Modifies the definition of "struct crypto_acomp_ctx" to represent a
> > configurable number of acomp_reqs and buffers. Adds a "nr_reqs" to
> > "struct crypto_acomp_ctx" to contain the nr of resources that will be
> > allocated in the cpu onlining code.
> >
> > 2) The zswap_cpu_comp_prepare() cpu onlining code will detect if the
> > crypto_acomp created for the pool (in other words, the zswap
> > compression algorithm) has registered an implementation for
> > batch_compress() and batch_decompress(). If so, it will set "nr_reqs" to
> > SWAP_CRYPTO_BATCH_SIZE and allocate these many reqs/buffers, and set
> > the acomp_ctx->nr_reqs accordingly. If the crypto_acomp does not
> > support batching, "nr_reqs" defaults to 1.
> >
> > 3) Adds a "bool can_batch" to "struct zswap_pool" that step (2) will set to
> > true if the batching API are present for the crypto_acomp.
>
> Why do we need this "can_batch" field? IIUC, this can be determined
> from the compressor internal fields itself, no?
>
> acomp_has_async_batching(acomp);
>
> Is this just for convenience, or is this actually an expensive thing to compute?

Thanks for your comments. This is a good question. I did not want to infer
that batching resources have been allocated for a cpu based only on what
acomp_has_async_batching() returns. It is possible that the cpu onlining
code ran into an -ENOMEM error on a particular cpu. In that case, I set
pool->can_batch to "false", mainly for convenience, so that zswap
can be somewhat insulated from migration. I agree that this may not be
the best solution; whether or not batching is enabled can instead be
determined directly, just before the call to crypto_acomp_batch_compress(),
based on:

acomp_ctx->nr_reqs == SWAP_CRYPTO_BATCH_SIZE;

I currently have a BUG_ON() that fires if this condition is not met; it relies
on pool->can_batch gating the flow that reaches zswap_batch_compress().

I think a better solution would be to check that the acomp_ctx has
SWAP_CRYPTO_BATCH_SIZE resources right after we acquire the acomp_ctx->mutex
and before the call to crypto_acomp_batch_compress(). If it does, we proceed
with batching; if not, we call crypto_acomp_compress(). This seems to be the
only way to know for sure whether the crypto batching API can be called, given
that migration is possible at any point in zswap_store(). Once we hold the
mutex, it seems we can decide whether to batch based on this check (although
the UAF situation remains as a larger issue, beyond the scope of this patch).
I would appreciate other ideas as well.
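
To make the proposed check concrete, here is a minimal userspace sketch (not
kernel code). It assumes a mock context in which a bool stands in for
acomp_ctx->mutex; mock_acomp_ctx, mock_lock(), mock_unlock() and
choose_compress_path() are hypothetical names used only for illustration, and
the value 8 is taken from the patch description:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWAP_CRYPTO_BATCH_SIZE 8 /* value quoted from the patch description */

/* Hypothetical userspace stand-in for struct crypto_acomp_ctx. */
struct mock_acomp_ctx {
	bool locked;          /* stands in for acomp_ctx->mutex */
	unsigned int nr_reqs; /* reqs/buffers actually allocated on this cpu */
};

enum compress_path { PATH_SINGLE, PATH_BATCH };

static void mock_lock(struct mock_acomp_ctx *ctx)
{
	assert(!ctx->locked);
	ctx->locked = true;
}

static void mock_unlock(struct mock_acomp_ctx *ctx)
{
	assert(ctx->locked);
	ctx->locked = false;
}

/*
 * Decide which path to take only after the lock is held, based on the
 * resources this context really has, not on what the algorithm advertises.
 */
static enum compress_path choose_compress_path(struct mock_acomp_ctx *ctx)
{
	enum compress_path path;

	assert(ctx != NULL);
	mock_lock(ctx);
	if (ctx->nr_reqs == SWAP_CRYPTO_BATCH_SIZE)
		path = PATH_BATCH;  /* would call crypto_acomp_batch_compress() */
	else
		path = PATH_SINGLE; /* would call crypto_acomp_compress() */
	mock_unlock(ctx);
	return path;
}
```

A context whose onlining hit -ENOMEM would be left with nr_reqs == 1 in this
model, so it takes the single-page path without needing pool->can_batch.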

Also, I have submitted a patch-series [1] that incorporates Yosry's and
Johannes' suggestions on this series. It sets up a consolidated
zswap_store()/zswap_store_pages() code path for batching and non-batching
compressors. My goal is for [1] to go through code reviews, so that the
transition to batching then needs only a simple check:

	if (acomp_ctx->nr_reqs == SWAP_CRYPTO_BATCH_SIZE)
		zswap_batch_compress();
	else
		zswap_compress();

Please feel free to provide code review comments in [1]. Thanks!

[1]: https://patchwork.kernel.org/project/linux-mm/list/?series=912937

>
> >
> > SWAP_CRYPTO_BATCH_SIZE is set to 8, which will be the IAA compress batching
>
> I like a sane default value as much as the next guy, but this seems a
> bit odd to me:
>
> 1. The placement of this constant/default value seems strange to me.
> This is a compressor-specific value no? Why are we enforcing this
> batching size at the zswap level, and uniformly at that? What if we
> introduce a new batch compression algorithm...? Or am I missing
> something, and this is a sane default for other compressors too?

You bring up an excellent point. This is a compressor-specific value.
Instead of setting it up as a constant, which, as you correctly observe,
may not make sense for a non-IAA compressor, one option is to query the
compressor for it, say:

int acomp_get_max_batchsize(struct crypto_acomp *tfm) {...};

The zswap cpu onlining code would then use the returned value to allocate
sufficient acomp_reqs/buffers/etc.
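
As a rough illustration of that idea, here is a userspace sketch (not kernel
code). acomp_get_max_batchsize() is only the name proposed in this email, not
an existing crypto API; mock_acomp, mock_acomp_get_max_batchsize() and
nr_reqs_for() are hypothetical, and treating 0 as "no batching support" is an
assumption made for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct crypto_acomp. */
struct mock_acomp {
	unsigned int max_batch; /* assumed: 0 means no batching support */
};

/* Models the proposed query; each compressor reports its own limit. */
static unsigned int mock_acomp_get_max_batchsize(const struct mock_acomp *tfm)
{
	assert(tfm != NULL);
	return tfm->max_batch;
}

/*
 * Number of reqs/buffers the cpu onlining code would allocate:
 * the compressor's batch size, or 1 if it does not batch.
 */
static unsigned int nr_reqs_for(const struct mock_acomp *tfm)
{
	unsigned int n = mock_acomp_get_max_batchsize(tfm);

	return n > 1 ? n : 1;
}
```

With this shape, zswap would not need a uniform batch-size constant; a
compressor reporting 8 and another reporting 4 would each get a context sized
to their own limit.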

>
> 2. Why is this value set to 8? Experimentation? Could you add some
> justification in documentation?

Can I get back to you later this week with a proposal for this? We plan
to have a team discussion on how best to approach this for current
and future hardware.

Thanks,
Kanchana