Re: [PATCH v2] driver core: Fix sleeping in invalid context during device link deletion

From: Saravana Kannan
Date: Thu Jul 16 2020 - 19:09:22 EST


On Thu, Jul 16, 2020 at 3:13 PM Marek Szyprowski
<m.szyprowski@xxxxxxxxxxx> wrote:
>
> Hi Saravana,
>
> On 16.07.2020 23:45, Saravana Kannan wrote:
> > Marek and Guenter reported that commit 287905e68dd2 ("driver core:
> > Expose device link details in sysfs") caused sleeping/scheduling while
> > atomic warnings.
> >
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
> > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 12, name: kworker/0:1
> > 2 locks held by kworker/0:1/12:
> > #0: ee8074a8 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
> > #1: ee921f20 ((work_completion)(&sdp->work)){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
> > Preemption disabled at:
> > [<c01b10f0>] srcu_invoke_callbacks+0xc0/0x154
> > ----- 8< ----- SNIP
> > [<c064590c>] (device_del) from [<c0645c9c>] (device_unregister+0x24/0x64)
> > [<c0645c9c>] (device_unregister) from [<c01b10fc>] (srcu_invoke_callbacks+0xcc/0x154)
> > [<c01b10fc>] (srcu_invoke_callbacks) from [<c01493c4>] (process_one_work+0x234/0x7dc)
> > [<c01493c4>] (process_one_work) from [<c01499b0>] (worker_thread+0x44/0x51c)
> > [<c01499b0>] (worker_thread) from [<c0150bf4>] (kthread+0x158/0x1a0)
> > [<c0150bf4>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
> > Exception stack(0xee921fb0 to 0xee921ff8)
> >
> > This was caused by the device link device being released in the context
> > of srcu_invoke_callbacks(). There is no need to wait till the RCU
> > callback to release the device link device. So release the device
> > earlier and move the call_srcu() into the device release code. That way,
> > the memory will get freed only after the device is released AND the RCU
> > callback is called.
> >
> > Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
> > Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> > Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> > Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> > ---
> >
> > v1->v2:
> > - Better fix
> > - Changed subject
> > - v1 is this patch https://lore.kernel.org/lkml/20200716050846.2047110-1-saravanak@xxxxxxxxxx/
> >
> > Marek and Guenter,
> >
> > I reproduced the original issue and tested this fix. Seems to work for
> > me. Can you confirm?
>
> Confirmed, this one fixes the issue! :)
>
> Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>

Thanks!

-Saravana