On Tue, Dec 17, 2024 at 9:59 AM Christophe Leroy
<christophe.leroy@xxxxxxxxxx> wrote:
Le 17/12/2024 à 09:16, Eric Dumazet a écrit :
On Tue, Dec 17, 2024 at 8:18 AM Christophe Leroy
<christophe.leroy@xxxxxxxxxx> wrote:
The following problem is encountered on kernel built with
CONFIG_PREEMPT. An snmp daemon running with normal priority is
regularly calling ioctl(SIOCGMIIPHY). Another process running with
SCHED_FIFO policy is regularly reading /sys/class/net/eth0/carrier.
After some random time, the snmp daemon gets preempted while holding
the RTNL mutex then the high priority process is busy looping into
carrier_show which bails out early due to a non-successfull
rtnl_trylock() which implies restart_syscall(). Because the snmp
daemon has a lower priority, it never gets the chances to release
the RTNL mutex and the high-priority task continues to loop forever.
Replace the trylock by lock_interruptible. This will increase the
priority of the task holding the lock so that it can release it and
allow the reader of /sys/class/net/eth0/carrier to actually perform
its read.
...
Fixes: 336ca57c3b4e ("net-sysfs: Use rtnl_trylock in sysfs methods.")
Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxxxxxx>
---
At a first glance, this might resurface the deadlock issue Eric W. Biederman
was trying to fix in 336ca57c3b4e ("net-sysfs: Use rtnl_trylock in
sysfs methods.")
Are you talking about the deadlock fixed (incompletely) by 5a5990d3090b
("net: Avoid race between network down and sysfs"), or the complement
provided by 336ca57c3b4e ?
My understanding is that mutex_lock() will return EINTR only if a signal
is pending so there is no need to set signal_pending like it was when
using mutex_trylock() which does nothing when the mutex is already locked.
And an EINTR return is expected and documented for a read() or a
write(), I can't see why we want ERESTARTNOINTR instead of ERSTARTSYS.
Isn't it the responsibility of the user app to call again read or write
if it has decided to not install the necessary sigaction for an
automatic restart ?
Do you think I should instead use rtnl_lock_killable() and return
ERESTARTNOINTR in case of failure ? In that case, is it still possible
to interrupt a blocked 'cat /sys/class/net/eth0/carrier' which CTRL+C ?
Issue is when no signal is pending, we have a typical deadlock situation :
One process A is :
Holding sysfs lock, then attempts to grab rtnl.
Another one (B) is :
Holding rtnl, then attempts to grab sysfs lock.
Using rtnl_lock_interruptible() in A will still block A and B, until
a CTRL+C is sent by another thread.