Re: BUG in alloc_workqueue (linux-next)
From: Lai Jiangshan
Date: Thu Jul 08 2021 - 23:59:15 EST
Hello, Pavel
Thanks for the report.
Huawei (CC-ed) is also dealing with the problem:
https://lore.kernel.org/lkml/20210708093136.2195752-1-yangyingliang@xxxxxxxxxx/t/#u
Could you please give the fix a try?
Thanks
Lai
On Thu, Jul 8, 2021 at 9:24 PM Pavel Skripkin <paskripkin@xxxxxxxxx> wrote:
>
> I've spent some time trying to come up with a fix, but I gave
> up :( But! I have an idea about what's happening, maybe it will help
> somehow...
>
>
> So, all 3 reports have the same stack trace: alloc_workqueue() in
> loop_configure(). I skimmed through syzbot's log and found that syzbot injected
> a failure into alloc_unbound_pwq() in all 3 cases:
>
> FAULT_INJECTION: forcing a failure.
> name failslab, interval 1, probability 0, space 0, times 0
> CPU: 1 PID: 17986 Comm: syz-executor.0 Tainted: G W 5.13.0-next-20210706 #9
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> Call Trace:
> dump_stack_lvl (lib/dump_stack.c:106 (discriminator 4))
> should_fail.cold (lib/fault-inject.c:52 lib/fault-inject.c:146)
> should_failslab (mm/slab_common.c:1327)
> kmem_cache_alloc_node (mm/slab.h:487 mm/slub.c:2902 mm/slub.c:3017)
> ? alloc_unbound_pwq (kernel/workqueue.c:3813)
> alloc_unbound_pwq (kernel/workqueue.c:3813)
> apply_wqattrs_prepare (kernel/workqueue.c:3963)
> apply_workqueue_attrs_locked (kernel/workqueue.c:4041)
> alloc_workqueue (kernel/workqueue.c:4078 kernel/workqueue.c:4201 kernel/workqueue.c:4309)
>
>
> So, if alloc_unbound_pwq() fails, apply_wqattrs_prepare() will jump to
> this code:
>
> out_free:
>         free_workqueue_attrs(tmp_attrs);
>         free_workqueue_attrs(new_attrs);
>         apply_wqattrs_cleanup(ctx);     <----|
>         return NULL;                         |
>                                              |
> put_pwq_unlocked() -> put_pwq() -> schedule_work(&pwq->unbound_release_work);
>
>
> and apply_wqattrs_cleanup() will schedule pwq_unbound_release_workfn()
> [2], but alloc_workqueue() will free the workqueue_struct in case of an
> alloc_unbound_pwq() error [1]. In that case we will get a UAF in
> pwq_unbound_release_workfn(), like in the 3rd report.
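
To make the ordering concrete, here is a minimal userspace sketch of the
sequence described above. It is only a model under stated assumptions:
every model_* name is hypothetical and none of this is kernel code. It
assumes the deferred release handler dereferences the owning workqueue,
as the UAF report suggests. The "free" is simulated with a flag so the
model itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for workqueue_struct. */
struct model_wq {
	bool freed;		/* set instead of calling free() */
};

/* Hypothetical stand-in for pool_workqueue. */
struct model_pwq {
	struct model_wq *wq;	/* back-pointer to the owner */
	int refcnt;
};

#define MODEL_MAX_PENDING 4

/* Deferred release work, standing in for schedule_work(). */
static struct model_pwq *model_pending[MODEL_MAX_PENDING];
static size_t model_nr_pending;

static void model_schedule_release(struct model_pwq *pwq)
{
	assert(model_nr_pending < MODEL_MAX_PENDING);
	model_pending[model_nr_pending++] = pwq;
}

/* Dropping the last reference defers the release instead of running it. */
static void model_put_pwq(struct model_pwq *pwq)
{
	if (--pwq->refcnt == 0)
		model_schedule_release(pwq);
}

/*
 * Run all deferred release handlers. Returns how many of them found
 * their owner already marked freed, i.e. would have been a UAF.
 */
static int model_run_pending(void)
{
	int stale = 0;

	for (size_t i = 0; i < model_nr_pending; i++)
		if (model_pending[i]->wq->freed)
			stale++;
	model_nr_pending = 0;
	return stale;
}

/*
 * Model of the error path. With flush_first == false the owner is
 * marked freed while the release work is still pending.
 */
static int model_error_path(bool flush_first)
{
	struct model_wq wq = { .freed = false };
	struct model_pwq pwq = { .wq = &wq, .refcnt = 1 };
	int stale = 0;

	model_put_pwq(&pwq);		/* cleanup drops the last ref */
	if (flush_first)
		stale += model_run_pending();
	wq.freed = true;		/* caller frees the owner */
	stale += model_run_pending();	/* deferred work runs later */
	return stale;
}
```

In this model, model_error_path(false) reports one stale access and
model_error_path(true) reports none. Draining the pending work first is
just one ordering that avoids the stale access here; it is not a claim
about how the linked patch fixes it.
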
>
>
> Does the above make sense? :)
>
>
>
> With regards,
> Pavel Skripkin