Re: [PATCH net v4 15/15] rxrpc: fix reference count leak in rxrpc_server_keyring()

From: David Howells

Date: Wed Apr 08 2026 - 06:52:11 EST


Jakub Kicinski <kuba@xxxxxxxxxx> wrote:

> And in return maybe you can scan the AI output and tell me
> if any of it is legit? ;)
>
> https://sashiko.dev/#/patchset/20260401105614.1696001-10-dhowells@xxxxxxxxxx

PATCH 2
-------
The complaint it makes about overflows in relation to patch 2:

net/rxrpc/key.c:rxrpc_preparse_xdr_yfs_rxgk() {
...
keylen = ntohl(key[-1]);
_debug("keylen: %x", keylen);
keylen = round_up(keylen, 4);
...
tktlen = ntohl(ticket[-1]);
_debug("tktlen: %x", tktlen);
tktlen = round_up(tktlen, 4);
...
}

is actually addressed by:

[PATCH net v4 05/15] rxrpc: Fix RxGK token loading to check bounds

I guess it doesn't look forward through the patches?


PATCH 8
-------
In regards to patch 8, it asks:

Does this change introduce an asymmetric behavior between configuring
RXRPC_SECURITY_KEY and RXRPC_SECURITY_KEYRING?

Yes, but it's then correct. An AF_RXRPC server socket can also make client
calls - and, indeed, the kernel AFS client requires this. The kernel AFS
client makes filesystem access/manipulation client calls but also listens for,
amongst other things, change notifications from the server.

With regards to its complaint about patch 12, I should really switch to using
lib/crypto/ to avoid using any memory allocation, but, yes, it has a point and
that crypto_skcipher_decrypt() (and encrypt) can fail too.

"sp->len < 8" is checked for at the beginning of the functions
rxkad_verify_packet_2(), so it must be at least one block in size. rxkad uses
pcbc to wrap fcrypt which should take care of the non-block alignment.

I'll add a patch to add error handling.


PATCH 13
--------
For patch 13, it says:

Are there integer overflows in rxgk_verify_response() when handling
token_len?

Yeah - I do the round up before the check. I'll add a patch to check that the
raw token_len <= len too (len must be < 65536 as the response must fit in a
single UDP packet).

It also says:

Does rxgk_verify_response() leak the rxgk_context structure (gk)?

Yep. Another patch for that.

And:

Does rxgk_do_verify_authenticator() still perform an out-of-bounds
read if the authenticator is smaller than 24 bytes?

Unfortunately, yes. Add one for that.

Further, it says:

Can modifying conn->channels[i].call_counter here cause a data race?

It shouldn't, since the value is only used to allocate the number for a call
that is set in callNumber in the Rx header for that call (well, the number may
also be copied inside the encrypted payload) but it's only *interpreted* by
the peer.

The way Rx works is that there's a separate callNumber space on each channel
on each connection. Only one call can be in progress on a given channel (the
channel number is in the bottom two bits of the connection ID field), and
calls on a channel are numbered consecutively. Seeing a new incoming call
with the next higher callNumber implicitly completes and ACKs the previous
call on that channel (potentially rendering explicit ACK packets unnecessary
on a busy channel).

If, however, a connection becomes idle, the server can just discard its record
of it. Should the client resume activity on that connection, the RESPONSE
packet conveys the client's call counter for each channel, from which the
server reinitialises the counters.

rxrpc_expose_client_call() cranks the counter for the client side;
rxgk_do_verify_authenticator() sets the counter for the server side. This
sets the expectation in a secure environment of what callNumbers should be
next on a connection to help prevent replay attacks.

If the server sees the call ID revert more than 1 (to allow for duplicate
packets from the previous call), it should abort the connection.

So I think nothing needs to be fixed here, as a secure connection isn't
allowed to proceed until the RESPONSE packet is fully parsed and the transport
security set up.


PATCH 14
--------
For patch 14, it says:

While this fixes the panic for auth_len, can a maliciously large
token_len (e.g., 0xFFFFFFFF) cause an integer overflow in the same
function that leads to the exact same BUG_ON() panic?

but this is the same as the first thing it says against patch 13. I've added
a patch to check token_len.

Ditto for:

Does this function also unconditionally leak the rxgk_context
allocated for the transport key?

It then says:

Are there memory leaks and data races if duplicate RESPONSE packets
are processed concurrently?

That can't happen as there's a single work item, conn->processor, that
processes all connection-level events, including CHALLENGE and RESPONSE
issuing and parsing for that connection. Each connection has its own totally
independent security context, so there shouldn't be interference between two
connections.

And finally, it says:

If rxrpc_process_event() does not verify if the connection is already
secured before invoking the verify_response() callback, duplicate
packets would cause rxgk_verify_response() to unconditionally
overwrite conn->key with the new key, leaking the previous key's
reference.

Additionally, if rxgk_init_connection_security() unconditionally
overwrites conn->rxgk.keys[] without putting the old key or holding
conn->security_use_lock, could this cause another rxgk_context leak
and race against concurrent read accesses in rxgk_get_key()?

These are both addressed by a patch I've been sent that I'll add.


PATCH 15
--------
Regarding patch 15, which provides an alternative fix to patch 8, I previously
asked you to drop patch 15 - but I'm thinking now it's probably better to keep
patch 15 and drop patch 8 (and change patch 15 to return -EINVAL).

David