Re: [PATCH net] mptcp: fix soft lockup in mptcp_recvmsg()

From: Matthieu Baerts

Date: Tue Mar 03 2026 - 13:10:38 EST


Hi Li,

On 02/03/2026 06:26, Li Xiasong wrote:
> syzbot reported a soft lockup in mptcp_recvmsg() [0].
>
> When receiving data with MSG_PEEK | MSG_WAITALL flags, the skb is not
> removed from the sk_receive_queue. This causes sk_wait_data() to always
> find available data and never perform actual waiting, leading to a soft
> lockup.
>
> Fix this by adding a 'last' parameter to track the last peeked skb.
> This allows sk_wait_data() to make informed waiting decisions and prevent
> infinite loops when MSG_PEEK is used.

(...)

> Fixes: 612f71d7328c ("mptcp: fix possible stall on recvmsg()")
> Signed-off-by: Li Xiasong <lixiasong1@xxxxxxxxxx>
> ---
> net/mptcp/protocol.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index cf1852b99963..7a65c2101f63 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -2006,7 +2006,7 @@ static void mptcp_eat_recv_skb(struct sock *sk, struct sk_buff *skb)
> static int __mptcp_recvmsg_mskq(struct sock *sk, struct msghdr *msg,
> size_t len, int flags, int copied_total,
> struct scm_timestamping_internal *tss,
> - int *cmsg_flags)
> + int *cmsg_flags, struct sk_buff **last)
> {
> struct mptcp_sock *msk = mptcp_sk(sk);
> struct sk_buff *skb, *tmp;
> @@ -2058,6 +2058,8 @@ static int __mptcp_recvmsg_mskq(struct sock *sk, struct msghdr *msg,
> }
>
> mptcp_eat_recv_skb(sk, skb);
> + } else {
> + *last = skb;

Out of curiosity, why set *last only in the MSG_PEEK case? Wouldn't it
be better to always call sk_wait_data() later with the last skb, even
when MSG_PEEK is not used?

Or would that cause other issues?
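
For context, here is how I read the wait condition, written as a small
userspace C model rather than kernel code. As far as I understand it,
sk_wait_data() sleeps until skb_peek_tail(&sk->sk_receive_queue) differs
from the skb it was given. All the model_* names below are made up for
illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and sk_receive_queue. */
struct model_skb { int id; };

#define MODEL_QUEUE_CAP 8

struct model_queue {
	struct model_skb *slots[MODEL_QUEUE_CAP];
	size_t len;
};

/* Like skb_peek_tail(): last queued entry, or NULL if the queue is empty. */
static struct model_skb *model_peek_tail(const struct model_queue *q)
{
	return q->len ? q->slots[q->len - 1] : NULL;
}

/* Append one entry at the tail, as a newly received skb would be. */
static void model_enqueue(struct model_queue *q, struct model_skb *skb)
{
	assert(q->len < MODEL_QUEUE_CAP);
	q->slots[q->len++] = skb;
}

/*
 * Models the condition sk_wait_data() waits for: the queue tail differs
 * from 'last'. Returns true when the caller would NOT sleep.
 */
static bool model_wait_would_return_now(const struct model_queue *q,
					const struct model_skb *last)
{
	return model_peek_tail(q) != last;
}
```

With last == NULL and a peeked skb still queued, the condition is
already true, so the wait returns at once and the loop in
mptcp_recvmsg() keeps spinning. With last pointing at the current tail,
the condition only becomes true once a new skb is queued.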

> }
>
> if (copied >= len)
> @@ -2263,6 +2265,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> {
> struct mptcp_sock *msk = mptcp_sk(sk);
> struct scm_timestamping_internal tss;
> + struct sk_buff *last = NULL;

Detail: the scope of this variable could be reduced by moving it inside
the while-loop. That should help to reduce conflicts during backports.

> int copied = 0, cmsg_flags = 0;
> int target;
> long timeo;
> @@ -2291,7 +2294,8 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
> int err, bytes_read;
>
> bytes_read = __mptcp_recvmsg_mskq(sk, msg, len - copied, flags,
> - copied, &tss, &cmsg_flags);
> + copied, &tss, &cmsg_flags,
> + &last);
> if (unlikely(bytes_read < 0)) {
> if (!copied)
> copied = bytes_read;
> @@ -2343,7 +2347,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
>
> pr_debug("block timeout %ld\n", timeo);
> mptcp_cleanup_rbuf(msk, copied);
> - err = sk_wait_data(sk, &timeo, NULL);
> + err = sk_wait_data(sk, &timeo, last);
> if (err < 0) {
> err = copied ? : err;
> goto out_err;

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.