Re: [BUG] KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
From: Sam Sun
Date: Wed Jun 24 2026 - 09:44:42 EST
On Wed, Jun 24, 2026 at 6:01 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Wed, Jun 24, 2026 at 2:01 AM Yue Sun <samsun1006219@xxxxxxxxx> wrote:
> >
> > Hello,
> >
> > I hit a reproducible use-after-free in the UDP tunnel NIC offload work item.
> > The original local crash was reported by KFENCE as:
> >
> > KFENCE: use-after-free read in udp_tunnel_nic_device_sync_work
> >
> > On current mainline, the C reproducer below triggers the same lifetime bug,
> > reported by KASAN before KFENCE samples the object:
> >
> > BUG: KASAN: slab-use-after-free in __mutex_lock
> > Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
> >
> > Tested kernel:
> >
> > 840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
> > Linux 7.1.0-11240-g840ef6c78e6a #31 SMP PREEMPT_DYNAMIC
> >
>
>
> Thanks for the report.
>
> Can you test the following patch?
>
> diff --git a/net/ipv4/udp_tunnel_nic.c b/net/ipv4/udp_tunnel_nic.c
> index 9944ed923ddfd10f9adf6ad788c0740daeaf2adb..c5f8d2f9d325de8f4d2247ddaa52e33378851857 100644
> --- a/net/ipv4/udp_tunnel_nic.c
> +++ b/net/ipv4/udp_tunnel_nic.c
> @@ -304,8 +304,8 @@ udp_tunnel_nic_device_sync(struct net_device *dev, struct udp_tunnel_nic *utn)
>  	if (!utn->need_sync)
>  		return;
>
> -	queue_work(udp_tunnel_nic_workqueue, &utn->work);
>  	utn->work_pending = 1;
> +	queue_work(udp_tunnel_nic_workqueue, &utn->work);
>  }
>
>  static bool
> @@ -866,6 +866,11 @@ udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
>
>  	udp_tunnel_nic_lock(dev);
>
> +	if (utn->work_pending) {
> +		udp_tunnel_nic_unlock(dev);
> +		return;
> +	}
> +
>  	/* For a shared table remove this dev from the list of sharing devices
>  	 * and if there are other devices just detach.
>  	 */
> @@ -901,12 +906,6 @@ udp_tunnel_nic_unregister(struct net_device *dev, struct udp_tunnel_nic *utn)
>  	udp_tunnel_nic_flush(dev, utn);
>  	udp_tunnel_nic_unlock(dev);
>
> -	/* Wait for the work to be done using the state, netdev core will
> -	 * retry unregister until we give up our reference on this device.
> -	 */
> -	if (utn->work_pending)
> -		return;
> -
>  	udp_tunnel_nic_free(utn);
>  release_dev:
>  	dev->udp_tunnel_nic = NULL;

I tested the patch, but unfortunately the C reproducer still triggers the
same use-after-free for me.

Tested on top of:

840ef6c78e6a ("Merge tag 'nfs-for-7.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")

I booted the kernel with KASAN/KFENCE enabled and:

panic_on_warn=1 panic_on_oops=1 kfence.sample_interval=1

Then I ran the same C reproducer:

timeout -k 10 360 /root/repro

The VM panicked after about 236 seconds:

[ 236.471119][ T58] BUG: KASAN: slab-use-after-free in __mutex_lock+0x16d0/0x1d80
[ 236.473404][ T58] Read of size 8 at addr ff11000076a63ea8 by task kworker/u16:3/58
[ 236.476455][ T58] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 236.476478][ T58] Workqueue: udp_tunnel_nic udp_tunnel_nic_device_sync_work
[ 236.476787][ T58] __mutex_lock+0x16d0/0x1d80
[ 236.477020][ T58] udp_tunnel_nic_device_sync_work+0x32/0x9c0
[ 236.477068][ T58] process_one_work+0x9de/0x1bf0

The allocation/free stacks are still the same shape:
```
Allocated by task 11563:
 __kmalloc_noprof
 udp_tunnel_nic_netdevice_event+0x12d8/0x1e80
 register_netdevice
 nsim_create
 nsim_dev_reload_up
 devlink_reload

Freed by task 11609:
 kfree
 udp_tunnel_nic_netdevice_event+0xc26/0x1e80
 unregister_netdevice_many_notify
 nsim_destroy
 nsim_dev_reload_down
 devlink_reload

Last potentially related work creation:
 queue_work_on
 __udp_tunnel_nic_del_port+0x2af/0x320
 udp_tunnel_notify_del_rx_port
 __geneve_sock_release.part.0
 geneve_stop

Second to last potentially related work creation:
 queue_work_on
 __udp_tunnel_nic_add_port+0x6ec/0xd70
 udp_tunnel_notify_add_rx_port
 geneve_open
```

My read of the patch is that it closes the small window where queue_work()
can publish the work before utn->work_pending is set, and it also prevents
udp_tunnel_nic_unregister() from flushing/freeing the object when
work_pending is already set.

However, the test above suggests that work_pending still does not fully
protect the lifetime of struct udp_tunnel_nic. The crashing work was still
queued through udp_tunnel_nic_device_sync() at line 308, so the patched path
was exercised. One suspicious point is that udp_tunnel_nic_device_sync_work()
clears utn->work_pending at the beginning of the worker, while the same work
item can still interact with replay/add/del-port state. The reproducer can
still end up with udp_tunnel_nic_unregister() freeing utn while a
udp_tunnel_nic_device_sync_work item later runs and dereferences the freed
utn->lock.
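
As a toy illustration of that suspicion, the userspace C sketch below models
one interleaving that would produce this shape of crash. It is only a sketch
under stated assumptions: struct toy_utn, every toy_* helper, and the split of
the worker into a "dequeue" step and a "body" step are hypothetical and are
not the kernel code. Locking is ignored entirely, and this has not been
confirmed as the sequence the reproducer actually hits.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct udp_tunnel_nic. Only work_pending is a real
 * field name; queued and freed are hypothetical bookkeeping. */
struct toy_utn {
	bool work_pending; /* flag the unregister path checks */
	bool queued;       /* models the workqueue's own "pending" bit */
	bool freed;        /* set once the toy unregister frees the object */
};

/* Patched ordering from the diff: set the flag, then queue the work. */
static void toy_device_sync(struct toy_utn *u)
{
	u->work_pending = true;
	u->queued = true;
}

/* The workqueue picks the item up; from here it can be queued again. */
static void toy_dequeue(struct toy_utn *u)
{
	assert(u->queued);
	u->queued = false;
}

/* Worker body: returns false if it would touch a freed object. */
static bool toy_worker_body(struct toy_utn *u)
{
	if (u->freed)
		return false;      /* this is where the UAF would be */
	u->work_pending = false;   /* cleared at the start of the worker */
	return true;
}

/* Unregister frees only if the flag says no work is outstanding. */
static bool toy_unregister(struct toy_utn *u)
{
	if (u->work_pending)
		return false;      /* netdev core would retry later */
	u->freed = true;
	return true;
}

/* Replays the interleaving; true means run #2 touched freed memory. */
bool toy_interleaving_hits_uaf(void)
{
	struct toy_utn u = { false, false, false };

	toy_device_sync(&u);       /* first queue: flag=1, queued=1 */
	toy_dequeue(&u);           /* run #1 dequeued, body not started */
	toy_device_sync(&u);       /* second queue: flag=1, queued=1 */
	if (!toy_worker_body(&u))  /* run #1 body clears the flag */
		return true;
	/* queued=1 but work_pending=0: the flag no longer covers run #2 */
	if (!toy_unregister(&u))
		return false;
	toy_dequeue(&u);           /* run #2 starts after the free */
	return !toy_worker_body(&u);
}
```

In this model toy_interleaving_hits_uaf() returns true: once the first run
clears the flag, nothing records that a second run is still queued, so the
toy unregister frees the object underneath it. Whether the real reproducer
follows this exact order is an open question.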

So this patch does not seem to be sufficient for this reproducer.

Thanks,
Yue