Re: [PATCH net v2] ipv6: addrconf: skip autoconf on unregistering devices
From: Tetsuo Handa
Date: Thu May 14 2026 - 06:40:58 EST
On 2026/05/14 18:44, Xu Rao wrote:
> The reproducer is the syz repro from the syzbot report:
>
> https://syzkaller.appspot.com/x/repro.syz?x=103f3dba580000
>
> I don't have a standalone C reproducer. I asked syzbot to test the patch
> against the original report tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
>
> The reproducer did not trigger the issue with this patch applied:
Will you test your patch with
- if (dev->reg_state == NETREG_UNREGISTERING)
- break;
-
+ WARN_ON(dev->reg_state == NETREG_UNREGISTERING);
and confirm that WARN_ON() fires? Nobody knows yet whether this refcount
was obtained after dev->reg_state became NETREG_UNREGISTERING.
Please be aware that the stack traces that follow
unregister_netdevice: waiting for netdevsim1 to become free. Usage count = 2
do not indicate that these references were obtained after the
unregister_netdevice: waiting for netdevsim1 to become free. Usage count = 2
message was printed. The stack traces indicate only where the leaking refcount
was obtained.
>> The kernel repeatedly sends NETDEV_UNREGISTER notifications when it's
>> waiting for the reference count to drop.
It is true that the kernel repeatedly sends NETDEV_UNREGISTER notifications,
but it is not always true that the NETDEV_UNREGISTER handlers do work for
every notification. Some NETDEV_UNREGISTER handlers act only on the first
NETDEV_UNREGISTER notification and do nothing for the subsequent
NETDEV_UNREGISTER notifications, for a NETDEV_UNREGISTER handler might
unregister its cleanup functions (and then fail to clean up resources obtained
afterwards).
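To illustrate the pattern described above, here is a minimal userspace model.
It is only a sketch: the struct and every function name below are hypothetical
and are not kernel APIs. The handler releases what it holds on the first
notification and ignores later ones, so a reference taken in between is never
dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	int refcnt;          /* references held on the device */
	int held_by_handler; /* references this subsystem currently holds */
	bool cleanup_done;   /* set once the handler has run its cleanup */
};

/* The subsystem takes a reference from a path not serialized with the handler. */
static void model_take_ref(struct model_dev *dev)
{
	dev->refcnt++;
	dev->held_by_handler++;
}

/* Handler that only acts on the first "unregister" notification. */
static void model_unregister_notify(struct model_dev *dev)
{
	if (dev->cleanup_done)
		return; /* subsequent notifications are ignored */
	assert(dev->refcnt >= dev->held_by_handler);
	dev->refcnt -= dev->held_by_handler;
	dev->held_by_handler = 0;
	dev->cleanup_done = true;
}

/* Returns the usage count left after a late reference and a repeated notification. */
static int model_leak_scenario(void)
{
	struct model_dev dev = { .refcnt = 1, .held_by_handler = 0, .cleanup_done = false };

	model_take_ref(&dev);          /* refcnt = 2 */
	model_unregister_notify(&dev); /* first notification drops it: refcnt = 1 */
	model_take_ref(&dev);          /* late reference: refcnt = 2 */
	model_unregister_notify(&dev); /* repeated notification: no-op */
	return dev.refcnt;
}
```

In this model the count stays at 2 no matter how many more notifications are
sent, which is why repeated notifications alone do not guarantee cleanup.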
My experience says that operations which are not serialized by the RTNL lock are
more prone to this refcount race than the netdev handlers which are serialized
by the RTNL lock. To close such races, we need to check operations which are not
serialized by the RTNL lock. An example was commit 5d5602236f5d ("can: j1939:
make j1939_session_activate() fail if device is no longer registered").
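The general shape of such a check can be sketched in userspace as follows.
This is not the code from that commit; the enum, struct and function names are
hypothetical, and the model is single-threaded, so it only shows the "refuse to
take a reference once the device is no longer registered" idea.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; names mirror kernel concepts but are not kernel APIs. */
enum model_reg_state { MODEL_REGISTERED, MODEL_UNREGISTERING };

struct model_netdev {
	enum model_reg_state reg_state;
	int refcnt;
};

/*
 * Take a reference only while the device is still registered.
 * In real code the check and the increment would have to sit under a lock
 * that the unregister path also takes; otherwise the check itself races.
 */
static int model_activate(struct model_netdev *dev)
{
	if (dev->reg_state != MODEL_REGISTERED)
		return -ENODEV;
	dev->refcnt++;
	return 0;
}

/* Returns the usage count after an activation attempt that arrives too late. */
static int model_late_activate_refcnt(void)
{
	struct model_netdev dev = { .reg_state = MODEL_REGISTERED, .refcnt = 1 };
	int err;

	err = model_activate(&dev);          /* succeeds: refcnt = 2 */
	assert(err == 0);
	dev.refcnt--;                        /* unregister path drops that reference */
	dev.reg_state = MODEL_UNREGISTERING;
	err = model_activate(&dev);          /* refused: refcnt stays 1 */
	assert(err == -ENODEV);
	return dev.refcnt;
}
```

The check is only meaningful if it cannot interleave with the state change,
which is the serialization question raised above.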
> The guard only affects the MTU / UP / CHANGE paths. The problem I was
> trying to avoid is creating new IPv6 state once the device is already in
> NETREG_UNREGISTERING. In the syzbot trace, addrconf_notify() is reached
> from a NETDEV_CHANGE path and then creates a link-local address and its
> route while unregister is already in progress. The route then holds a
> netdev reference via fib6_nh_init().
>
> So the patch does not rely on suppressing unregister processing; it only
> prevents late autoconf from adding new state during unregister.
Unless you confirm that WARN_ON() fires, a single successful "#syz test" response
is not sufficient grounds for believing that your patch actually fixes this problem.
I think that we want some debug printk() patches like
https://lore.kernel.org/all/e0c7030b-261c-4ed1-b6b0-bf3b83a41d60@xxxxxxxxxxxxxxxxxxx/T/
in order to get more debug information.