Re: [PATCH v2] aoe: fix the potential use-after-free problem in more places

From: Valentin Kleibel
Date: Thu Sep 12 2024 - 07:37:11 EST


> Then Nicolai Stange found more places in aoe have potential use-after-free
> problem with tx(). e.g. revalidate(), aoecmd_ata_rw(), resend(), probe()
> and aoecmd_cfg_rsp(). Those functions also use aoenet_xmit() to push
> packet to tx queue. So they should also use dev_hold() to increase the
> refcnt of skb->dev.

We've tested your patch on our servers and ran into an issue.
Under heavy I/O load, the aoe device had stalled I/Os (e.g. rsync waiting indefinitely on one core), which can be "fixed" by running aoe-revalidate on that device.

Additionally, when trying to shut down the system, we see the message:
unregister_netdevice: waiting for XXX to become free. Usage Count = XXXXX
on aoe devices with a usage count somewhere in the millions.
This is the same as without the patch, so I assume the fix is still incomplete.

Thanks for your work,
Valentin