Re: [PATCH v2] mm/zsmalloc.c: close race window between zs_pool_dec_isolated() and zs_unregister_migration()
From: Miaohe Lin
Date: Thu Jul 08 2021 - 08:14:35 EST
Sorry for the disturbs! Please ignore this duplicated one...
On 2021/7/8 19:51, Miaohe Lin wrote:
> There has one possible race window between zs_pool_dec_isolated() and
> zs_unregister_migration() because wait_for_isolated_drain() checks the
> isolated count without holding class->lock and there is no order inside
> zs_pool_dec_isolated(). Thus the below race window could be possible:
>
> zs_pool_dec_isolated zs_unregister_migration
> check pool->destroying != 0
> pool->destroying = true;
> smp_mb();
> wait_for_isolated_drain()
> wait for pool->isolated_pages == 0
> atomic_long_dec(&pool->isolated_pages);
> atomic_long_read(&pool->isolated_pages) == 0
>
> Since we observe the pool->destroying (false) before atomic_long_dec()
> for pool->isolated_pages, waking pool->migration_wait up is missed.
>
> Fix this by ensure checking pool->destroying is happened after the
> atomic_long_dec(&pool->isolated_pages).
>
> Fixes: 701d678599d0 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> ---
> v1->v2:
> Fix potential race window rather than simply combine atomic_long_dec
> and atomic_long_read.
>
> Hi Andrew,
> This patch is the version 2 of
> mm-zsmallocc-combine-two-atomic-ops-in-zs_pool_dec_isolated.patch.
> Many thanks.
> ---
> mm/zsmalloc.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 5f3df680f0a2..0fc388a0202d 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1830,10 +1830,11 @@ static inline void zs_pool_dec_isolated(struct zs_pool *pool)
> VM_BUG_ON(atomic_long_read(&pool->isolated_pages) <= 0);
> atomic_long_dec(&pool->isolated_pages);
> /*
> - * There's no possibility of racing, since wait_for_isolated_drain()
> - * checks the isolated count under &class->lock after enqueuing
> - * on migration_wait.
> + * Checking pool->destroying must happen after atomic_long_dec()
> + * for pool->isolated_pages above. Paired with the smp_mb() in
> + * zs_unregister_migration().
> */
> + smp_mb__after_atomic();
> if (atomic_long_read(&pool->isolated_pages) == 0 && pool->destroying)
> wake_up_all(&pool->migration_wait);
> }
>