Re: [PATCH] mm/slab_common: fix possible double free of kmem_cache

From: Vlastimil Babka
Date: Mon Sep 19 2022 - 05:12:51 EST


On 9/19/22 05:12, Feng Tang wrote:
> When running a slub_debug test, kfence's 'test_memcache_typesafe_by_rcu'
> kunit test case causes a use-after-free error:
>
> BUG: KASAN: use-after-free in kobject_del+0x14/0x30
> Read of size 8 at addr ffff888007679090 by task kunit_try_catch/261
>
> CPU: 1 PID: 261 Comm: kunit_try_catch Tainted: G B N 6.0.0-rc5-next-20220916 #17
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x34/0x48
> print_address_description.constprop.0+0x87/0x2a5
> print_report+0x103/0x1ed
> kasan_report+0xb7/0x140
> kobject_del+0x14/0x30
> kmem_cache_destroy+0x130/0x170
> test_exit+0x1a/0x30
> kunit_try_run_case+0xad/0xc0
> kunit_generic_run_threadfn_adapter+0x26/0x50
> kthread+0x17b/0x1b0
> </TASK>
>
> The cause is inside kmem_cache_destroy():
>
> kmem_cache_destroy
>   acquire lock/mutex
>     shutdown_cache
>       schedule_work(kmem_cache_release) (if RCU flag set)
>   release lock/mutex
>   kmem_cache_release (if RCU flag set)

^ not set

I've fixed that up.

>
> With certain timing, the scheduled work could run before the
> subsequent RCU flag check, which can then read a wrong value and
> lead to a double kmem_cache_release().
>
> Fix it by caching the RCU flag inside the protected area, just like 'refcnt'.
>
> Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>

Thanks!

> ---
>
> note:
>
> The error only happens on linux-next tree, and not in Linus' tree,
> which already has Waiman's commit:
> 0495e337b703 ("mm/slab_common: Deleting kobject in kmem_cache_destroy()
> without holding slab_mutex/cpu_hotplug_lock")

Actually, that commit is already in Linus' rc5, so I will send your fix
this week. I also added a Fixes: 0495e337b703 (...) tag.

> mm/slab_common.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 07b948288f84..ccc02573588f 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -475,6 +475,7 @@ void slab_kmem_cache_release(struct kmem_cache *s)
>  void kmem_cache_destroy(struct kmem_cache *s)
>  {
>  	int refcnt;
> +	bool rcu_set;
>
>  	if (unlikely(!s) || !kasan_check_byte(s))
>  		return;
> @@ -482,6 +483,8 @@ void kmem_cache_destroy(struct kmem_cache *s)
>  	cpus_read_lock();
>  	mutex_lock(&slab_mutex);
>
> +	rcu_set = s->flags & SLAB_TYPESAFE_BY_RCU;
> +
>  	refcnt = --s->refcount;
>  	if (refcnt)
>  		goto out_unlock;
> @@ -492,7 +495,7 @@ void kmem_cache_destroy(struct kmem_cache *s)
>  out_unlock:
>  	mutex_unlock(&slab_mutex);
>  	cpus_read_unlock();
> -	if (!refcnt && !(s->flags & SLAB_TYPESAFE_BY_RCU))
> +	if (!refcnt && !rcu_set)
>  		kmem_cache_release(s);
>  }
>  EXPORT_SYMBOL(kmem_cache_destroy);
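
For anyone following along, the pattern in the patch can be sketched in
plain userspace C. This is only an illustration, not the kernel code:
every demo_* name is hypothetical, demo_lock()/demo_unlock() are empty
stand-ins for the slab_mutex section, and the "deferred" release runs
immediately to model the worst-case timing, where the scheduled work
finishes before the caller gets to look at the flag again.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for SLAB_TYPESAFE_BY_RCU. */
#define DEMO_TYPESAFE_BY_RCU 0x1u

/* Hypothetical, heavily reduced stand-in for struct kmem_cache. */
struct demo_cache {
	unsigned int flags;
	int refcount;
};

int demo_sync_releases;     /* releases done directly by the destroy path */
int demo_deferred_releases; /* releases done by the simulated deferred work */

/* Empty stand-ins for taking and dropping the protecting lock. */
void demo_lock(void) { }
void demo_unlock(void) { }

struct demo_cache *demo_cache_create(unsigned int flags)
{
	struct demo_cache *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->flags = flags;
	c->refcount = 1;
	return c;
}

/*
 * Simulated deferred work. It frees the object right away, which models
 * the case where the work runs before the caller reaches its final check.
 */
void demo_schedule_release(struct demo_cache *c)
{
	demo_deferred_releases++;
	free(c);
}

/* Returns true if this call released the object synchronously. */
bool demo_cache_destroy(struct demo_cache *c)
{
	bool rcu_set;
	int refcnt;

	if (!c)
		return false;

	demo_lock();
	/* Snapshot everything needed later while the object is still valid. */
	rcu_set = c->flags & DEMO_TYPESAFE_BY_RCU;
	refcnt = --c->refcount;
	if (!refcnt && rcu_set)
		demo_schedule_release(c); /* c may be gone from here on */
	demo_unlock();

	/* Only the cached locals are consulted; c is not dereferenced. */
	if (!refcnt && !rcu_set) {
		demo_sync_releases++;
		free(c);
		return true;
	}
	return false;
}
```

With the flag cached, the object is released exactly once on either
path. Reading c->flags after demo_unlock() instead would touch freed
memory in the deferred case, and whatever value it happened to see
would decide whether a second release is attempted.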