Re: mm: zswap: fix crypto_free_acomp deadlock in zswap_cpu_comp_dead

From: Yosry Ahmed
Date: Tue Feb 25 2025 - 21:08:33 EST


On Wed, Feb 26, 2025 at 09:25:23AM +0800, Herbert Xu wrote:
> On Tue, Feb 25, 2025 at 01:43:41PM +0000, Yosry Ahmed wrote:
> >
> > Interesting, it's weird that crypto_free_acomp() allocates memory. Do you have the specific call path?
>
> crypto_free_acomp does not allocate memory. However, it takes
> the same mutex that is also taken on the allocation path.
>
> The specific call path can be seen in the original report:
>
> https://syzkaller.appspot.com/bug?extid=1a517ccfcbc6a7ab0f82

After staring at this for a while I think the following situation could
be the problem:

Task A running on CPU #1:
  crypto_alloc_acomp_node()
    Holds scomp_lock
    Enters reclaim
    Reads per_cpu_ptr(pool->acomp_ctx, cpu)

Task A is descheduled

zswap_cpu_comp_dead(CPU #1) // CPU #1 going offline
  Holds per_cpu_ptr(pool->acomp_ctx, cpu)->mutex
  Calls crypto_free_acomp()
    Waits for scomp_lock

Task A running on CPU #2:
  Waits for per_cpu_ptr(pool->acomp_ctx, cpu)->mutex
  DEADLOCK

In this case I think the fix is correct, thanks for looking into it.

Could you please:

(1) Explain the exact scenario in the commit log? I did not understand
it at first, only after looking at the syzbot dashboard for a while (and
I am not sure how long the report persists there).

(2) Move all the freeing operations outside the mutex? Right now
crypto_free_acomp() is the problematic call, but it could be
acomp_request_free() next.

Something like:

static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
{
	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
	struct acomp_req *req;
	struct crypto_acomp *acomp;
	u8 *buffer;

	if (IS_ERR_OR_NULL(acomp_ctx))
		return 0;

	/* Detach the resources from the context while holding the mutex */
	mutex_lock(&acomp_ctx->mutex);
	req = acomp_ctx->req;
	acomp_ctx->req = NULL;
	acomp = acomp_ctx->acomp;
	acomp_ctx->acomp = NULL;
	buffer = acomp_ctx->buffer;
	acomp_ctx->buffer = NULL;
	mutex_unlock(&acomp_ctx->mutex);

	/*
	 * Do the actual freeing after releasing the mutex to avoid subtle
	 * locking dependencies causing deadlocks.
	 */
	if (!IS_ERR_OR_NULL(req))
		acomp_request_free(req);
	if (!IS_ERR_OR_NULL(acomp))
		crypto_free_acomp(acomp);
	/* Free the saved pointer; acomp_ctx->buffer is already NULL here */
	kfree(buffer);

	return 0;
}
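
For anyone who wants to poke at the pattern outside the kernel, below is
a minimal userspace sketch of the same detach-under-lock, free-after-unlock
idea using pthreads. It is only an illustration, not the zswap code:
struct ctx, ctx_init() and ctx_teardown() are hypothetical names, and
malloc()/free() stand in for the crypto allocation and free routines.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct crypto_acomp_ctx. */
struct ctx {
	pthread_mutex_t mutex;
	void *req;
	void *acomp;
	unsigned char *buffer;
};

/* Set up the mutex and allocate placeholder resources. */
void ctx_init(struct ctx *ctx, size_t buf_size)
{
	assert(ctx != NULL);
	pthread_mutex_init(&ctx->mutex, NULL);
	ctx->req = malloc(16);
	ctx->acomp = malloc(16);
	ctx->buffer = malloc(buf_size);
}

/*
 * Detach the resources while holding the mutex, then free them only
 * after the mutex has been released. A free routine that takes some
 * other lock internally then never nests inside ctx->mutex.
 */
int ctx_teardown(struct ctx *ctx)
{
	void *req, *acomp;
	unsigned char *buffer;

	if (!ctx)
		return 0;

	pthread_mutex_lock(&ctx->mutex);
	req = ctx->req;
	ctx->req = NULL;
	acomp = ctx->acomp;
	ctx->acomp = NULL;
	buffer = ctx->buffer;
	ctx->buffer = NULL;
	pthread_mutex_unlock(&ctx->mutex);

	/* free(NULL) is a no-op, so a second teardown is harmless. */
	free(req);
	free(acomp);
	free(buffer);

	return 0;
}
```

Anyone taking the mutex after the teardown sees NULL pointers rather
than freed ones, so users of the context have to check for NULL under
the mutex before touching the resources.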