Re: [PATCH v1] driver core: Fix scheduling while atomic warnings during device link deletion

From: Saravana Kannan
Date: Thu Jul 16 2020 - 14:26:46 EST


On Wed, Jul 15, 2020 at 10:48 PM Marek Szyprowski
<m.szyprowski@xxxxxxxxxxx> wrote:
>
> Hi
>
> On 16.07.2020 07:30, Guenter Roeck wrote:
> > On 7/15/20 10:08 PM, Saravana Kannan wrote:
> >> Marek and Guenter reported that commit 287905e68dd2 ("driver core:
> >> Expose device link details in sysfs") caused sleeping/scheduling while
> >> atomic warnings.
> >>
> >> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
> >> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 12, name: kworker/0:1
> >> 2 locks held by kworker/0:1/12:
> >> #0: ee8074a8 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
> >> #1: ee921f20 ((work_completion)(&sdp->work)){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
> >> Preemption disabled at:
> >> [<c01b10f0>] srcu_invoke_callbacks+0xc0/0x154
> >> ----- 8< ----- SNIP
> >> [<c064590c>] (device_del) from [<c0645c9c>] (device_unregister+0x24/0x64)
> >> [<c0645c9c>] (device_unregister) from [<c01b10fc>] (srcu_invoke_callbacks+0xcc/0x154)
> >> [<c01b10fc>] (srcu_invoke_callbacks) from [<c01493c4>] (process_one_work+0x234/0x7dc)
> >> [<c01493c4>] (process_one_work) from [<c01499b0>] (worker_thread+0x44/0x51c)
> >> [<c01499b0>] (worker_thread) from [<c0150bf4>] (kthread+0x158/0x1a0)
> >> [<c0150bf4>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
> >> Exception stack(0xee921fb0 to 0xee921ff8)
> >>
> >> This was caused by the device link device being released in the context
> >> of srcu_invoke_callbacks(). There is no need to wait till the RCU
> >> callback to release the device link device. So release the device
> >> earlier and revert the RCU callback code to what it was before
> >> commit 287905e68dd2 ("driver core: Expose device link details in sysfs").
> >>
> >> Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
> >> Reported-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> >> Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> >> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> >> ---
> >> Marek and Guenter,
> >>
> >> I haven't had a chance to test this yet. Can one of you please test it
> >> and confirm it fixes the issue?
> >>
> > With this patch applied, the original warning is gone, but I get lots
> > of other warnings.
> >
> > WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> > Device 'regulators:regulator@0:50038000.ethernet' does not have a release() function, it is broken and must be fixed.
> >
> > WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> > Device '53f9c000.gpio:50038000.ethernet' does not have a release() function, it is broken and must be fixed.
> >
> > WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> > Device '50030000.tscadc:50030400.tcq' does not have a release() function, it is broken and must be fixed.
>
> I confirm that I also get such warnings for every platform device in the
> system with this patch applied to linux next-20200715:

Sigh... I should refrain from late night coding. I'll send a fix in a few hours.

-Saravana