Re: [PATCH net 5/5] afs: Fix uncancelled rxrpc OOB message handler

From: Simon Horman

Date: Thu Jun 18 2026 - 05:34:51 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
afs: Fix uncancelled rxrpc OOB message handler

This commit cancels AFS OOB message processing during shutdown and moves it
to afs_wq. It also prevents queuing if the net namespace is no longer live.

Link: https://sashiko.dev/#/patchset/20260609140911.838677-1-dhowells%40redhat.com

> diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
> index fd2d260fb25f9..6241f9349f6b6 100644
> --- a/fs/afs/rxrpc.c
> +++ b/fs/afs/rxrpc.c
[ ... ]
> @@ -128,6 +128,7 @@ void afs_close_socket(struct afs_net *net)
> _enter("");
>
> cancel_work_sync(&net->charge_preallocation_work);
> + cancel_work_sync(&net->rx_oob_work);
> kernel_listen(net->socket, 0);
> flush_workqueue(afs_async_calls);
> flush_workqueue(afs_wq);

[Severity: High]
This is a pre-existing issue, but does placing the cancellation and flush
before kernel_sock_shutdown() leave a window open for the work to be queued
afterwards?

If kernel_sock_shutdown() is what synchronously disables the notify_oob
callback from firing, couldn't a delayed callback still queue work on afs_wq
after we've already flushed it here?

[ ... ]
> @@ -985,5 +986,6 @@ static void afs_rx_notify_oob(struct sock *sk, struct sk_buff *oob)
> {
> struct afs_net *net = sk->sk_user_data;
>
> - schedule_work(&net->rx_oob_work);
> + if (net->live)
> + queue_work(afs_wq, &net->rx_oob_work);
> }

[Severity: High]
This is a pre-existing issue, but does this lockless check of net->live allow
a race condition that leads to a use-after-free?

Consider this sequence:

CPU1 (teardown)
afs_net_exit()
net->live = false;
afs_close_socket()
cancel_work_sync(&net->rx_oob_work);
flush_workqueue(afs_wq);
kernel_sock_shutdown() // blocks waiting for recvmsg_lock

CPU2 (incoming OOB)
rxrpc_notify_socket_oob()
spin_lock_irq(&rx->recvmsg_lock);
afs_rx_notify_oob()
// Reads net->live == true (stale read, or delayed execution)
queue_work(afs_wq, &net->rx_oob_work);
spin_unlock_irq(&rx->recvmsg_lock);

CPU1 then finishes kernel_sock_shutdown() and frees the net namespace, but
afs_wq is never flushed again.

Would moving cancel_work_sync() and flush_workqueue() to the end of
afs_close_socket(), after kernel_sock_shutdown() completes, prevent this race?