Re: [PATCH net] tipc: avoid sending zero-length stream messages
From: Cássio Gabriel Monteiro Pires
Date: Wed May 06 2026 - 21:52:58 EST
Hi!
On 5/6/26 03:41, Tung Quang Nguyen wrote:
>> Subject: [PATCH net] tipc: avoid sending zero-length stream messages
>>
>> TIPC stream send currently enters the transmit loop even when the user
>> payload length is zero. This can build and transmit a header-only connection
>> message.
>>
>> For local TIPC sockets, such messages are delivered synchronously through the
>> loopback receive path. When this happens while socket backlog processing is
>> being flushed, reply transmission can re-enter TIPC receive processing
>> repeatedly and trigger an RCU stall.
>>
> Can you demonstrate this scenario using code ? It is better to point out what current code is faulty.
The minimized user-visible trigger is essentially:

	int fd[2];
	struct msghdr msg = {};

	socketpair(AF_TIPC, SOCK_STREAM, 0, fd);

	/* In parallel, this makes release_sock() flush backlog. */
	setsockopt(fd[0], SOL_SOCKET, SO_ATTACH_BPF, &bad_fd,
		   sizeof(bad_fd));

	/* Repeated zero-length MSG_PROBE send on the connected peer. */
	for (i = 0; i < 64; i++)
		sendmsg(fd[1], &msg, MSG_PROBE | MSG_MORE);

The faulty current-code path is that TIPC stream send does not handle
MSG_PROBE before entering __tipc_sendstream(). MSG_PROBE is supposed to
probe without transmitting data, but the call reaches __tipc_sendstream()
with dlen == 0.
__tipc_sendstream() uses a do/while loop, so even when dlen is 0 the body
runs once:

	send = min_t(size_t, dlen - sent, TIPC_MAX_USER_MSG_SIZE);

At that point send is 0, but the code can still call tipc_msg_append() or
tipc_msg_build(), creating a TIPC connection message with only the header.
It then calls:

	tipc_node_xmit(net, txq, dnode, tsk->portid);

For a local TIPC socketpair, tipc_node_xmit() takes the in_own_node() path
and synchronously calls tipc_sk_rcv(). When this happens while
release_sock() is processing backlog, the receive path can generate
response traffic through tipc_node_distr_xmit(), which re-enters the same
local receive path.
I should have made that explicit in the changelog and pointed at the
missing MSG_PROBE handling as the faulty part.
>>
>> diff --git a/net/tipc/socket.c b/net/tipc/socket.c index
>> 9329919fb07f..3c7838713d74 100644
>> --- a/net/tipc/socket.c
>> +++ b/net/tipc/socket.c
>> @@ -1585,6 +1585,8 @@ static int __tipc_sendstream(struct socket *sock,
>> struct msghdr *m, size_t dlen)
>> tipc_sk_connected(sk)));
>> if (unlikely(rc))
>> break;
>> + if (unlikely(!dlen && sk->sk_type == SOCK_STREAM))
>> + break;
> This change is wrong. It immediately breaks normal connection set up because the ACK (zero in length) has no chance to be sent back from the server to the client.
> Please try to test your patch before submission.
I did test the patch with the syzkaller C repro under QEMU for 10 minutes, and
it did not trigger the reported RCU stall:

	/tmp/repro & pid=$!; sleep 600; kill $pid
	dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'

(output attached)

The dmesg check did not show any repro-triggered RCU stall, soft lockup,
panic, BUG, or WARNING. But that test only covered the syzkaller trigger;
it did not cover normal active/passive TIPC stream connection setup, which
your review points out is broken by this version.
I re-checked the TIPC connection setup path as well.
tipc_accept() intentionally sends the server-side ACK as a zero-length
stream message:

	iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
	__tipc_sendstream(new_sock, &m, 0);

So blocking all zero-length sends inside __tipc_sendstream() prevents
that ACK from being transmitted and can break normal SOCK_STREAM
connection setup.
After re-checking the syzkaller repro, the real trigger seems to be narrower
than zero-length stream send. The repro uses a user sendmsg() with
MSG_PROBE | MSG_MORE and no payload on an already connected TIPC stream
socket. MSG_PROBE is supposed to probe without sending, but TIPC stream
send currently lets that path reach __tipc_sendstream(), where the
do/while body can still run once with dlen == 0 and build/transmit a
header-only message.
I think we should avoid suppressing the internal __tipc_sendstream() ACK path
and instead handle the user-originated zero-length MSG_PROBE case before it
reaches the internal stream send helper.
The v2 fix would look like this:
-- 8< --
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 9329919fb07f..4783df337971 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1542,6 +1542,10 @@ static int tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz)
 	struct sock *sk = sock->sk;
 	int ret;
 
+	/* MSG_PROBE asks only to probe the path, not to transmit data. */
+	if (unlikely((m->msg_flags & MSG_PROBE) && !dsz))
+		return 0;
+
 	lock_sock(sk);
 	ret = __tipc_sendstream(sock, m, dsz);
 	release_sock(sk);
-- >8 --
I tested the reworked patch with the syzkaller C reproducer under QEMU.
The reproducer was run for 10 minutes:

	/tmp/repro & pid=$!; sleep 600; kill $pid
	dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'

(output attached)

The grep only matched boot-time command-line/debug messages; no
repro-triggered RCU stall, soft lockup, panic, BUG, or WARNING appeared.
What do you think?

# dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'
[ 0.000000][ T0] net.ifnames=0 panic_on_warn=1
[ 0.000000][ T0] Kernel command line: earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 \
[ 0.000000][ T0] Kernel command line: comedi.comedi_num_legacy_minors=4 panic_on_warn=1 console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial
[ 0.000000][ T0] net.ifnames=0 panic_on_warn=1
[ 0.000000][ T0] ** If you see this message and you are not debugging **
[ 0.000000][ T0] rcu: RCU callback double-/use-after-free debug is enabled.
[ 0.000000][ T0] rcu: RCU debug extended QS entry/exit.
[ 10.704615][ T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 21.826838][ T1] orangefs_debugfs_init: called with debug mask: :none: :0:
[ 22.032237][ T1] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[ 77.296497][ T1] usbcore: registered new interface driver usb_debug
[ 77.309604][ T1] usbserial: USB Serial support registered for debug
[ 114.238149][ T1] pvrusb2: Debug mask is 31 (0x1f)
[ 181.100641][ T1] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
[ 201.556741][ T1] Failed to set sysctl parameter 'max_rcu_stall_to_panic=1': parameter not found
[1]+ Terminated /tmp/repro
# dmesg | grep -Ei 'rcu.*stall|rcu_preempt|soft lockup|panic|BUG|WARNING'
[ 0.000000][ T0] Command line: console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial net.ifnames=0 panic_on_warn=1
[ 1.462430][ T0] Kernel command line: earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 \
[ 1.470761][ T0] Kernel command line: comedi.comedi_num_legacy_minors=4 panic_on_warn=1 console=ttyS0 root=/dev/vda1 rootfstype=ext4 rw earlyprintk=serial net.ifnames=0 panic_on_warn=1
[ 3.155914][ T0] ** If you see this message and you are not debugging **
[ 3.813298][ T0] rcu: RCU callback double-/use-after-free debug is enabled.
[ 3.814645][ T0] rcu: RCU debug extended QS entry/exit.
[ 17.096163][ T1] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 28.566521][ T1] orangefs_debugfs_init: called with debug mask: :none: :0:
[ 28.796190][ T1] SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
[ 84.486523][ T1] usbcore: registered new interface driver usb_debug
[ 84.504286][ T1] usbserial: USB Serial support registered for debug
[ 114.419251][ T1] pvrusb2: Debug mask is 31 (0x1f)
[ 179.396180][ T1] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers
[ 185.907359][ T1] Failed to set sysctl parameter 'max_rcu_stall_to_panic=1': parameter not found
[1]+ Terminated /tmp/repro