Re: [PATCH net v2] net-sysfs: check device is present when showing duplex

From: Shigeru Yoshida
Date: Tue Jul 30 2024 - 22:58:44 EST


Hi Jamie,

On Mon, 29 Jul 2024 10:12:10 +1000, Jamie Bainbridge wrote:
> A sysfs reader can race with a device reset or removal, attempting to
> read device state when the device is not actually present.
>
> This is the same sort of panic as observed in commit 4224cfd7fb65
> ("net-sysfs: add check for netdevice being present to speed_show"):
>
> [exception RIP: qed_get_current_link+17]
> #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
> #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
> #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
> #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
> #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
> #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
> #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
> #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
> #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
> #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb
>
> crash> struct net_device.state ffff9a9d21336000
> state = 5,
>
> state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
> The device is not present: note the lack of __LINK_STATE_PRESENT (0b10).
>
> Resolve by adding the same netif_device_present() check to duplex_show.
>
> Fixes: 8ae6daca85c8 ("ethtool: Call ethtool's get/set_settings callbacks with cleaned data")
> Signed-off-by: Jamie Bainbridge <jamie.bainbridge@xxxxxxxxx>
> ---
> v2: Restrict patch to just required path and describe problem in more
> detail as suggested by Johannes Berg. Improve commit message format
> as suggested by Shigeru Yoshida.
> ---
> net/core/net-sysfs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 0e2084ce7b7572bff458ed7e02358d9258c74628..22801d165d852a6578ca625b9674090519937be5 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -261,7 +261,7 @@ static ssize_t duplex_show(struct device *dev,
>  	if (!rtnl_trylock())
>  		return restart_syscall();
>
> -	if (netif_running(netdev)) {
> +	if (netif_running(netdev) && netif_device_present(netdev)) {
>  		struct ethtool_link_ksettings cmd;
>
>  		if (!__ethtool_get_link_ksettings(netdev, &cmd)) {

As for the qede driver mentioned in the commit log, I assume the race
is between duplex_show() and qede_recovery_handler().
qede_recovery_handler() clears __LINK_STATE_PRESENT when recovery
fails, and it is called with the rtnl lock held. duplex_show() also
takes the rtnl lock before the check, so the two cannot interleave and
I think the patch works correctly.
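
For reference, the state decoding from the crash output quoted above
can be reproduced with a small userspace sketch. The bit positions are
assumed to follow the masks given in the commit log (START = 0b1,
PRESENT = 0b10, NOCARRIER = 0b100). The sketch_* names are made up for
illustration; they only model the bit tests behind netif_running() and
netif_device_present() and are not the kernel implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Bit positions, assumed to match the masks quoted in the commit log. */
enum sketch_link_state {
	SKETCH_LINK_STATE_START = 0,     /* mask 0b001 */
	SKETCH_LINK_STATE_PRESENT = 1,   /* mask 0b010 */
	SKETCH_LINK_STATE_NOCARRIER = 2, /* mask 0b100 */
};

/* Hypothetical userspace stand-in for a single-bit test on dev->state. */
static bool sketch_test_bit(unsigned int bit, unsigned long state)
{
	assert(bit < sizeof(state) * CHAR_BIT);
	return (state >> bit) & 1UL;
}

/* Models the condition the old code checked on its own. */
static bool sketch_running(unsigned long state)
{
	return sketch_test_bit(SKETCH_LINK_STATE_START, state);
}

/* Models the additional condition the patch adds. */
static bool sketch_present(unsigned long state)
{
	return sketch_test_bit(SKETCH_LINK_STATE_PRESENT, state);
}

/* Combined condition guarding the link settings query after the patch. */
static bool sketch_may_query_link(unsigned long state)
{
	return sketch_running(state) && sketch_present(state);
}
```

With state = 5 from the dump, sketch_running() is true and
sketch_present() is false, so the old check alone lets the reader
proceed into the driver while the combined check does not.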

As Paolo mentioned, I think the issue has existed since
duplex_show()/show_duplex() was first added.

Anyway,

Reviewed-by: Shigeru Yoshida <syoshida@xxxxxxxxxx>

> --
> 2.39.2
>
>