Re: [PATCH net] net: annotate data race around dev->flags in __dev_change_flags

From: Eric Dumazet
Date: Mon Jul 29 2024 - 13:46:45 EST


On Mon, Jul 29, 2024 at 6:34 PM Jeongjun Park <aha310510@xxxxxxxxx> wrote:
>
> According to KCSAN report, there is a read/write race between
> __dev_change_flags and netif_is_bond_master for dev->flags.
>
> Thereforce, __dev_change_flags() needs protection.
>
> <syzbot>
> BUG: KCSAN: data-race in __dev_change_flags / is_upper_ndev_bond_master_filter
>
> read-write to 0xffff888112d970b0 of 4 bytes by task 4888 on cpu 0:
> __dev_change_flags+0x9a/0x410 net/core/dev.c:8755
> rtnl_configure_link net/core/rtnetlink.c:3321 [inline]
> rtnl_newlink_create net/core/rtnetlink.c:3518 [inline]
> __rtnl_newlink net/core/rtnetlink.c:3730 [inline]
> rtnl_newlink+0x121e/0x1690 net/core/rtnetlink.c:3743
> rtnetlink_rcv_msg+0x85e/0x910 net/core/rtnetlink.c:6635
> netlink_rcv_skb+0x12c/0x230 net/netlink/af_netlink.c:2564
> rtnetlink_rcv+0x1c/0x30 net/core/rtnetlink.c:6653
> netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
> netlink_unicast+0x58d/0x660 net/netlink/af_netlink.c:1361
> netlink_sendmsg+0x5ca/0x6e0 net/netlink/af_netlink.c:1905
> sock_sendmsg_nosec net/socket.c:730 [inline]
> __sock_sendmsg+0x140/0x180 net/socket.c:745
> ____sys_sendmsg+0x312/0x410 net/socket.c:2585
> ___sys_sendmsg net/socket.c:2639 [inline]
> __sys_sendmsg+0x1e9/0x280 net/socket.c:2668
> __do_sys_sendmsg net/socket.c:2677 [inline]
> __se_sys_sendmsg net/socket.c:2675 [inline]
> __x64_sys_sendmsg+0x46/0x50 net/socket.c:2675
> x64_sys_call+0xb25/0x2d70 arch/x86/include/generated/asm/syscalls_64.h:47
> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> read to 0xffff888112d970b0 of 4 bytes by task 11 on cpu 1:
> netif_is_bond_master include/linux/netdevice.h:5020 [inline]
> is_upper_ndev_bond_master_filter+0x2b/0xb0 drivers/infiniband/core/roce_gid_mgmt.c:275
> ib_enum_roce_netdev+0x124/0x1d0 drivers/infiniband/core/device.c:2310
> ib_enum_all_roce_netdevs+0x8a/0x100 drivers/infiniband/core/device.c:2337
> netdevice_event_work_handler+0x15b/0x3c0 drivers/infiniband/core/roce_gid_mgmt.c:626
> process_one_work kernel/workqueue.c:3248 [inline]
> process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3329
> worker_thread+0x526/0x720 kernel/workqueue.c:3409
> kthread+0x1d1/0x210 kernel/kthread.c:389
> ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>
> value changed: 0x00001002 -> 0x00000202
>
> Reported-by: syzbot+113b65786d8662e21ff7@xxxxxxxxxxxxxxxxxxxxxxxxx
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Jeongjun Park <aha310510@xxxxxxxxx>
> ---
> net/core/dev.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 6ea1d20676fb..3b9626cdfd9a 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -8799,7 +8799,7 @@ EXPORT_SYMBOL(dev_get_flags);
> int __dev_change_flags(struct net_device *dev, unsigned int flags,
> struct netlink_ext_ack *extack)
> {
> - unsigned int old_flags = dev->flags;
> + unsigned int old_flags = READ_ONCE(dev->flags);
> int ret;
>
> ASSERT_RTNL();
> @@ -8808,12 +8808,13 @@ int __dev_change_flags(struct net_device *dev, unsigned int flags,
> * Set the flags on our device.
> */
>
> - dev->flags = (flags & (IFF_DEBUG | IFF_NOTRAILERS | IFF_NOARP |
> - IFF_DYNAMIC | IFF_MULTICAST | IFF_PORTSEL |
> - IFF_AUTOMEDIA)) |
> - (dev->flags & (IFF_UP | IFF_VOLATILE | IFF_PROMISC |
> - IFF_ALLMULTI));
> + unsigned int new_flags = (flags & (IFF_DEBUG | IFF_NOTRAILERS | IFF_NOARP |
> + IFF_DYNAMIC | IFF_MULTICAST | IFF_PORTSEL |
> + IFF_AUTOMEDIA)) |
> + (READ_ONCE(dev->flags) & (IFF_UP | IFF_VOLATILE | IFF_PROMISC |
> + IFF_ALLMULTI));
>
> + WRITE_ONCE(dev->flags, new_flags);
> /*
> * Load in the correct multicast list now the flags have changed.
> */
> @@ -8839,12 +8840,12 @@ int __dev_change_flags(struct net_device *dev, unsigned int flags,
>
> if ((flags ^ dev->gflags) & IFF_PROMISC) {
> int inc = (flags & IFF_PROMISC) ? 1 : -1;
> - unsigned int old_flags = dev->flags;
> + unsigned int old_flags = READ_ONCE(dev->flags);
>
> dev->gflags ^= IFF_PROMISC;
>
> if (__dev_set_promiscuity(dev, inc, false) >= 0)
> - if (dev->flags != old_flags)
> + if (READ_ONCE(dev->flags) != old_flags)
> dev_set_rx_mode(dev);
> }
>

These READ_ONCE() in RTNL protected regions are not necessary, because
dev->flags can not be changed by another thread.