Re: [BUG] rxrpc: Client connection leak and BUG() call during kernel IO thread exit

From: arjan

Date: Thu Apr 23 2026 - 10:47:04 EST


From: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>

This email was generated by automation to help kernel developers deal
with a large volume of AI-generated bug reports by decoding oopses
into more actionable information.


Decoded Backtrace

--- rxrpc_destroy_client_conn_ids (inlined into rxrpc_purge_client_connections)
Source: net/rxrpc/conn_client.c

54 static void rxrpc_destroy_client_conn_ids(struct rxrpc_local *local)
55 {
56 	struct rxrpc_connection *conn;
57 	int id;
58
59 	if (!idr_is_empty(&local->conn_ids)) {
60 		idr_for_each_entry(&local->conn_ids, conn, id) {
61 			pr_err("AF_RXRPC: Leaked client conn %p {%d}\n",
62 			       conn, refcount_read(&conn->ref));
63 		}
64 		BUG(); // <- crash here
65 	}
66
67 	idr_destroy(&local->conn_ids);
68 }

--- rxrpc_destroy_local
Source: net/rxrpc/local_object.c

420 void rxrpc_destroy_local(struct rxrpc_local *local)
421 {
422 	struct socket *socket = local->socket;
423 	struct rxrpc_net *rxnet = local->rxnet;
...
427 	local->dead = true;
...
433 	rxrpc_clean_up_local_conns(local);
434 	rxrpc_service_connection_reaper(&rxnet->service_conn_reaper);
435 	ASSERT(!local->service);
...
450 	rxrpc_purge_queue(&local->rx_queue);
451 	rxrpc_purge_client_connections(local); // <- call here
452 	page_frag_cache_drain(&local->tx_alloc);
453 }

--- rxrpc_io_thread
Source: net/rxrpc/io_thread.c

554 		if (!list_empty(&local->new_client_calls))
555 			rxrpc_connect_client_calls(local);
...
569 		if (should_stop)
570 			break;
...
596 	__set_current_state(TASK_RUNNING);
598 	rxrpc_destroy_local(local); // <- call here
601 	return 0;


Tentative Analysis

The crash is the unconditional BUG() at net/rxrpc/conn_client.c:64,
which fires because local->conn_ids is still non-empty when the
krxrpcio I/O thread calls rxrpc_destroy_local() during socket teardown.

When a client sendmsg() queues a call, the I/O thread picks it up via
rxrpc_connect_client_calls(). That function allocates a client
connection (rxrpc_alloc_client_connection()), registers it in the
local->conn_ids IDR with refcount=1, stores it in bundle->conns[], and
moves the call from new_client_calls to bundle->waiting_calls.

Once new_client_calls is empty and kthread_should_stop() is true, the
I/O thread exits its loop and calls rxrpc_destroy_local(). Inside that
function, rxrpc_clean_up_local_conns() iterates only the
local->idle_client_conns list. A connection that is in bundle->conns[]
but has never been activated on a channel (and thus never went idle) is
completely missed. rxrpc_purge_client_connections() then finds the
connection still registered in conn_ids and fires BUG().
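
The coverage gap described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: every model_* name is hypothetical, a fixed array stands
in for the local->conn_ids IDR, and a boolean flag stands in for
membership of local->idle_client_conns. Locking, refcounting and the
bundle itself are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_IDS 4	/* arbitrary size for the toy ID table */

/* Hypothetical stand-in for struct rxrpc_connection. */
struct model_conn {
	bool on_idle_list;	/* models membership of idle_client_conns */
};

/* Hypothetical stand-in for struct rxrpc_local. */
struct model_local {
	struct model_conn *conn_ids[MODEL_MAX_IDS];	/* models conn_ids */
};

/* Models allocation in the I/O thread: registered, but not yet idle. */
static void model_register_conn(struct model_local *local, int id,
				struct model_conn *conn)
{
	assert(id >= 0 && id < MODEL_MAX_IDS);
	conn->on_idle_list = false;
	local->conn_ids[id] = conn;
}

/*
 * Models the existing cleanup: it only releases connections that are
 * on the idle list. The model scans the ID table and filters on the
 * flag instead of walking a real list.
 */
static void model_clean_up_idle_only(struct model_local *local)
{
	for (int id = 0; id < MODEL_MAX_IDS; id++) {
		struct model_conn *conn = local->conn_ids[id];

		if (conn && conn->on_idle_list)
			local->conn_ids[id] = NULL;
	}
}

/* Models the leak check: count entries still registered. */
static int model_count_leaked(const struct model_local *local)
{
	int leaked = 0;

	for (int id = 0; id < MODEL_MAX_IDS; id++)
		if (local->conn_ids[id])
			leaked++;
	return leaked;
}

/* One connection went idle, one was only ever held by a bundle. */
static int model_demo_leaked_conns(void)
{
	struct model_local local = { .conn_ids = { NULL } };
	struct model_conn was_idle, never_idle;

	model_register_conn(&local, 0, &was_idle);
	was_idle.on_idle_list = true;
	model_register_conn(&local, 1, &never_idle);

	model_clean_up_idle_only(&local);
	return model_count_leaked(&local);
}
```

In this model the never-idle connection is the one entry left in the
table after cleanup, which corresponds to the state in which the real
rxrpc_destroy_client_conn_ids() reaches BUG().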

The coverage gap was introduced by commit 9d35d880e0e4 ("rxrpc: Move
client call connection to the I/O thread"), which created a new
"allocated in bundle, not yet idle" state for connections that the
existing idle-list cleanup does not handle.

Note: fc9de52de38f ("rxrpc: Fix missing locking causing hanging calls"),
already present in 6.18.13, fixes a related missing-lock bug in the
same code area but does not address this idle-list coverage gap.


Potential Solution

rxrpc_clean_up_local_conns() should be extended to also release
connections stored in bundle->conns[] that have not yet appeared on
idle_client_conns. After the existing idle-list loop, the function
should iterate over all entries in local->client_bundles (the RB-tree
of active bundles), call rxrpc_unbundle_conn() on each occupied
bundle->conns[] slot, and drop the reference on that connection. This
should leave the IDR empty by the time
rxrpc_destroy_client_conn_ids() runs.
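
The shape of the proposed extra pass can be sketched in userspace as
follows. This is an illustration under assumptions, not a patch: all
sketch_* names are hypothetical, arrays replace the
local->client_bundles RB-tree and the conn_ids IDR, clearing a slot
stands in for rxrpc_unbundle_conn(), and the model simply assumes that
the final put removes the connection's ID. The real functions have
locking and lifetime rules that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_IDS		8	/* arbitrary toy ID table size */
#define SKETCH_BUNDLE_SLOTS	4	/* arbitrary slots per bundle */
#define SKETCH_MAX_BUNDLES	2	/* arbitrary number of bundles */

/* Hypothetical stand-in for struct rxrpc_connection. */
struct sketch_conn {
	int id;		/* index in the toy ID table */
	int ref;	/* simplified reference count */
};

/* Hypothetical stand-in for struct rxrpc_bundle. */
struct sketch_bundle {
	struct sketch_conn *conns[SKETCH_BUNDLE_SLOTS];
};

/* Hypothetical stand-in for struct rxrpc_local. */
struct sketch_local {
	struct sketch_conn *conn_ids[SKETCH_MAX_IDS];
	struct sketch_bundle *bundles[SKETCH_MAX_BUNDLES];
};

/* Drop one reference; the model removes the ID on the last put. */
static void sketch_put_conn(struct sketch_local *local,
			    struct sketch_conn *conn)
{
	assert(conn->ref > 0);
	if (--conn->ref == 0)
		local->conn_ids[conn->id] = NULL;
}

/* Proposed extra pass: empty every occupied bundle slot, drop its ref. */
static void sketch_clean_up_bundled_conns(struct sketch_local *local)
{
	for (int b = 0; b < SKETCH_MAX_BUNDLES; b++) {
		struct sketch_bundle *bundle = local->bundles[b];

		if (!bundle)
			continue;
		for (int s = 0; s < SKETCH_BUNDLE_SLOTS; s++) {
			struct sketch_conn *conn = bundle->conns[s];

			if (!conn)
				continue;
			bundle->conns[s] = NULL;	/* "unbundle" */
			sketch_put_conn(local, conn);
		}
	}
}

/* Models the final check: is the ID table empty? */
static bool sketch_ids_empty(const struct sketch_local *local)
{
	for (int id = 0; id < SKETCH_MAX_IDS; id++)
		if (local->conn_ids[id])
			return false;
	return true;
}

/* A connection held only by a bundle slot is released by the pass. */
static bool sketch_demo_teardown_is_clean(void)
{
	struct sketch_conn conn = { .id = 3, .ref = 1 };
	struct sketch_bundle bundle = { .conns = { [0] = &conn } };
	struct sketch_local local = {
		.conn_ids = { [3] = &conn },
		.bundles = { [0] = &bundle },
	};

	sketch_clean_up_bundled_conns(&local);
	return sketch_ids_empty(&local);
}
```

Whether the real put path removes the IDR entry at this point, and
which locks the walk over client_bundles needs, would have to be
checked against the rxrpc source before turning this into a patch.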


More information

Oops-Analysis: http://oops.fenrus.org/reports/lkml/CAPhRvkyZGKHRTBhV3P2PCCRxmRKGEvJQ0W5a9SMW3qwS2hp2Qw/
Assisted-by: GitHub-Copilot:claude-sonnet-4.6 linux-kernel-oops-x86.