Re: [PATCH v7] mm/zswap: move to use crypto_acomp API for hardware acceleration
From: Sebastian Andrzej Siewior
Date: Mon Nov 09 2020 - 05:29:14 EST
I've been looking at the patch and it looks like it should work. Having
numbers to backup the performance in the pure-software version and with
HW acceleration would _very_ nice to have.
On 2020-11-07 19:53:32 [+1300], Barry Song wrote:
> index fbb7829..73f04de 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -415,30 +445,54 @@ static int zswap_dstmem_dead(unsigned int cpu)
…
> + acomp_ctx->req = req;
> +
> + crypto_init_wait(&acomp_ctx->wait);
> + /*
> + * if the backend of acomp is async zip, crypto_req_done() will wakeup
> + * crypto_wait_req(); if the backend of acomp is scomp, the callback
> + * won't be called, crypto_wait_req() will return without blocking.
> + */
> + acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
> + crypto_req_done, &acomp_ctx->wait);
> +
> + acomp_ctx->mutex = per_cpu(zswap_mutex, cpu);
> + acomp_ctx->dstmem = per_cpu(zswap_dstmem, cpu);
You added a comment here and there you never mentioned that this single
per-CPU mutex protects the per-CPU context (which you can have more than
one on a single CPU) and the scratch/dstmem which is one per-CPU. Of
course if you read the code you figure it out.
I still think that you should have a pool of memory and crypto contexts
which you can use instead of having them strictly per-CPU. The code is
fully preemptible and you may have multiple requests on the same CPU.
Yes, locking works but at the same you block processing while waiting on
a lock and the "reserved memory" on other CPUs remains unused.
Sebastian