Re: [PATCH] mptcp: serialize subflow->closing with RX path
From: Paolo Abeni
Date: Thu May 07 2026 - 13:11:19 EST
On 5/7/26 9:28 AM, Kalpan Jani wrote:
> There is a race between mptcp_data_ready() (RX path) and
> mptcp_close_ssk() (teardown path) when accessing subflow->closing.
>
> Currently, mptcp_data_ready() checks subflow->closing before acquiring
> mptcp_data_lock(), while mptcp_close_ssk() may concurrently set
> subflow->closing and
Are you sure this race can really happen? Both the relevant part of
__mptcp_close_ssk() and mptcp_data_ready() run under the ssk socket
lock.
> @@ -2653,9 +2673,12 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
> if (sk->sk_state == TCP_ESTABLISHED)
> mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL);
>
> - /* Remove any reference from the backlog to this ssk; backlog skbs consume
> + /* Remove any reference from the backlog to this ssk.
> + * Serialize cleanup with RX-side enqueue using mptcp_data_lock().
> + * Backlog skbs consume
> * space in the msk receive queue, no need to touch sk->sk_rmem_alloc
> */
> + mptcp_data_lock(sk);
> list_for_each_entry(skb, &msk->backlog_list, list) {
> if (skb->sk != ssk)
> continue;
The real problem is here: the backlog is currently traversed without
holding mptcp_data_lock(), which is what protects backlog updates, while
the ssk is still possibly open and unlocked, so it can keep receiving
packets and adding them to the backlog.
A better solution would be something like the following patch (completely
untested):
---
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 718e910ff23f..68d97926cb81 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2550,6 +2550,21 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
lock_sock_nested(ssk, SINGLE_DEPTH_NESTING);
subflow->closing = 1;
+ /* Remove any reference from the backlog to this ssk; backlog skbs consume
+ * space in the msk receive queue, no need to touch sk->sk_rmem_alloc
+ */
+ if (flags & MPTCP_CF_PUSH) {
+ mptcp_data_lock(sk);
+ list_for_each_entry(skb, &msk->backlog_list, list) {
+ if (skb->sk != ssk)
+ continue;
+
+ atomic_sub(skb->truesize, &skb->sk->sk_rmem_alloc);
+ skb->sk = NULL;
+ }
+ mptcp_data_unlock(sk);
+ }
+
/* Borrow the fwd allocated page left-over; fwd memory for the subflow
could be negative at this point, but will reach zero soon - when
* the data allocated using such fragment will be freed.
@@ -2653,17 +2668,6 @@ void mptcp_close_ssk(struct sock *sk, struct sock *ssk,
if (sk->sk_state == TCP_ESTABLISHED)
mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL);
- /* Remove any reference from the backlog to this ssk; backlog skbs consume
- * space in the msk receive queue, no need to touch sk->sk_rmem_alloc
- */
- list_for_each_entry(skb, &msk->backlog_list, list) {
- if (skb->sk != ssk)
- continue;
-
- atomic_sub(skb->truesize, &skb->sk->sk_rmem_alloc);
- skb->sk = NULL;
- }
-
/* subflow aborted before reaching the fully_established status
* attempt the creation of the next subflow
*/