Re: [PATCH] nvmet-tcp: Ensure old keys are freed before replacing new ones

From: Hannes Reinecke

Date: Thu Apr 16 2026 - 02:16:33 EST


On 4/16/26 01:02, alistair23@xxxxxxxxx wrote:
From: Alistair Francis <alistair.francis@xxxxxxx>

Previously after the host sends a REPLACETLSPSK we freed the TLS keys as
part of calling nvmet_auth_sq_free() on success. A recent change ensured
we don't free the keys, allowing REPLACETLSPSK to work.

But that fix results in a kernel memory leak when running

```
nvme_trtype=loop ./check nvme/041 nvme/042 nvme/043 nvme/044 nvme/045 nvme/051 nvme/052
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
```

We can't free the keys on a successful DHCHAP operation, otherwise the
next REPLACETLSPSK will fail, so instead let's free them before we
replace them as part of nvmet_auth_challenge().

This ensures that REPLACETLSPSK works, while also avoiding any memory
leaks.

Fixes: 2e6eb6b277f59 ("nvmet-tcp: Don't free SQ on authentication success")
Signed-off-by: Alistair Francis <alistair.francis@xxxxxxx>
---
drivers/nvme/target/fabrics-cmd-auth.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/nvme/target/fabrics-cmd-auth.c b/drivers/nvme/target/fabrics-cmd-auth.c
index b9ab80c7a6941..58185184478a4 100644
--- a/drivers/nvme/target/fabrics-cmd-auth.c
+++ b/drivers/nvme/target/fabrics-cmd-auth.c
@@ -412,6 +412,13 @@ static int nvmet_auth_challenge(struct nvmet_req *req, void *d, int al)
int hash_len = nvme_auth_hmac_hash_len(ctrl->shash_id);
int data_size = sizeof(*d) + hash_len;
+ /*
+ * If replacing the keys then we have previous successful keys
+ * that might be leaked, so we need to free them here.
+ */
+ if (req->sq->dhchap_c1)
+ nvmet_auth_sq_free(req->sq);
+
if (ctrl->dh_tfm)
data_size += ctrl->dh_keysize;
if (al < data_size) {
I am not sure.
The authentication variables should be freed as soon as the authentication completes; the session key is ephemeral and
should not be stored longer than necessary and will _never_
be used again once authentication completes.
The TLS key, OTOH, is used throughout the session and needs
to be present while the session is active
As such, both sets have vastly different lifetimes, and
I would argue that this

void nvmet_auth_sq_free(struct nvmet_sq *sq)
{
cancel_delayed_work(&sq->auth_expired_work);
#ifdef CONFIG_NVME_TARGET_TCP_TLS
sq->tls_key = NULL;
#endif
kfree(sq->dhchap_c1);
sq->dhchap_c1 = NULL;

is actually wrong as we should not modify 'tls_key' here.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich