Re: [PATCH] net: macb: fix sleep inside spinlock

From: Paolo Abeni
Date: Tue Sep 12 2023 - 03:09:43 EST


On Fri, 2023-09-08 at 13:29 +0200, Sascha Hauer wrote:
> macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
> which can sleep. This results in:
>
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> > pps pps1: new PPS source ptp1
> > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
> > preempt_count: 1, expected: 0
> > RCU nest depth: 0, expected: 0
> > 4 locks held by kworker/u4:3/40:
> > #0: ffff000003409148
> > macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
> > ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> > #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
> > #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
> > #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
> > irq event stamp: 113998
> > hardirqs last enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
> > hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
> > softirqs last enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
> > softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
> > CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
> > Hardware name: ... ZynqMP ... (DT)
> > Workqueue: events_power_efficient phylink_resolve
> > Call trace:
> > dump_backtrace+0x98/0xf0
> > show_stack+0x18/0x24
> > dump_stack_lvl+0x60/0xac
> > dump_stack+0x18/0x24
> > __might_resched+0x144/0x24c
> > __might_sleep+0x48/0x98
> > __mutex_lock+0x58/0x7b0
> > mutex_lock_nested+0x24/0x30
> > clk_prepare_lock+0x4c/0xa8
> > clk_set_rate+0x24/0x8c
> > macb_mac_link_up+0x25c/0x2ac
> > phylink_resolve+0x178/0x4e8
> > process_one_work+0x1ec/0x51c
> > worker_thread+0x1ec/0x3e4
> > kthread+0x120/0x124
> > ret_from_fork+0x10/0x20
>
> The obvious fix is to move the call to macb_set_tx_clk() out of the
> protected area. This seems safe as rx and tx are both disabled anyway at
> this point.
> It is however not entirely clear what the spinlock shall protect. It
> could be the read-modify-write access to the NCFGR register, but this
> is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
> without holding the spinlock. It could also be the register accesses
> done in mog_init_rings() or macb_init_buffers(), but again these
> functions are called without holding the spinlock in macb_hresp_error_task().
> The locking seems fishy in this driver and it might deserve another look
> before this patch is applied.

macb_set_tx_clk() moved under the bp->lock scope as a consequence of
the blamed commit. Such commit moved a bunch of code originally in
macb_mac_config() and under the bp->lock lock into macb_mac_link_up().

I guess that the lock was added to the latter function to be on the
"safe side", but it looks like macb_set_tx_clk() does not need it.

The patch LGTM, thanks!

Paolo