Re: [PATCH net-next] trace: tcp: Add tracepoint for tcp_sendmsg()

From: Breno Leitao
Date: Wed Feb 26 2025 - 11:13:08 EST


Hello David,

On Mon, Feb 24, 2025 at 12:16:04PM -0700, David Ahern wrote:
> On 2/24/25 12:03 PM, Eric Dumazet wrote:
> > On Mon, Feb 24, 2025 at 7:24 PM Breno Leitao <leitao@xxxxxxxxxx> wrote:
> >>
> >> Add a lightweight tracepoint to monitor TCP sendmsg operations, enabling
> >> the tracing of TCP messages being sent.
> >>
> >> Meta has been using BPF programs to monitor this function for years,
> >> indicating significant interest in observing this important
> >> functionality. Adding a proper tracepoint provides a stable API for all
> >> users who need visibility into TCP message transmission.
> >>
> >> The implementation uses DECLARE_TRACE instead of TRACE_EVENT to avoid
> >> creating unnecessary trace event infrastructure and tracefs exports,
> >> keeping the implementation minimal while stabilizing the API.
> >>
> >> Given that this patch creates a rawtracepoint, you could hook into it
> >> using regular tooling, like bpftrace, using regular rawtracepoint
> >> infrastructure, such as:
> >>
> >> rawtracepoint:tcp_sendmsg_tp {
> >> ....
> >> }
> >
> > I would expect tcp_sendmsg() being stable enough ?
> >
> > kprobe:tcp_sendmsg {
> > }
>
> Also, if a tracepoint is added, inside of tcp_sendmsg_locked would cover
> more use cases (see kernel references to it).

Agreed, this seems to provide more useful information.

> We have a patch for a couple years now with a tracepoint inside the

Sorry, where is this patch? Is it downstream?

> while (msg_data_left(msg)) {
> }
>
> loop which is more useful than just entry to sendmsg.

Do you mean something like the following?

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 1a40c41ff8c30..23318e252d6b9 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -259,6 +259,11 @@ TRACE_EVENT(tcp_retransmit_synack,
 		  __entry->saddr_v6, __entry->daddr_v6)
 );

+DECLARE_TRACE(tcp_sendmsg_tp,
+	TP_PROTO(const struct sock *sk, const struct msghdr *msg, size_t size, ssize_t copied),
+	TP_ARGS(sk, msg, size, copied)
+);
+
 DECLARE_TRACE(tcp_cwnd_reduction_tp,
 	TP_PROTO(const struct sock *sk, int newly_acked_sacked,
 		 int newly_lost, int flag),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 08d73f17e8162..5fcef82275d4a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1290,6 +1290,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 			sk_mem_charge(sk, copy);
 		}

+		trace_tcp_sendmsg_tp(sk, msg, size, copy);
+
 		if (!copied)
 			TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
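
To illustrate why a hook inside the copy loop reports more than a hook at
function entry, here is a rough user-space toy model. It is not kernel code:
`struct trace_stats`, `trace_hook()` and `model_sendmsg_locked()` are names
made up for this sketch, and the `chunk` parameter is only an assumed stand-in
for whatever bounds each copy in the real loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a tracepoint consumer: counts hits and bytes. */
struct trace_stats {
	unsigned int hits;	/* number of times the hook fired */
	size_t bytes;		/* sum of the per-iteration copy sizes */
};

/* Plays the role of the proposed hook: sees the request size and this copy. */
static void trace_hook(struct trace_stats *st, size_t size, size_t copy)
{
	st->hits++;
	st->bytes += copy;
	printf("sendmsg_tp: size=%zu copy=%zu\n", size, copy);
}

/*
 * Toy model of the "while (msg_data_left(msg))" loop: consume 'size' bytes
 * in pieces of at most 'chunk' bytes and fire the hook once per piece.
 * Returns the total number of bytes consumed.
 */
static size_t model_sendmsg_locked(struct trace_stats *st, size_t size,
				   size_t chunk)
{
	size_t left = size;
	size_t copied = 0;

	assert(chunk > 0);
	while (left) {
		size_t copy = left < chunk ? left : chunk;

		left -= copy;
		copied += copy;
		trace_hook(st, size, copy);
	}
	return copied;
}
```

With a zero-initialized `struct trace_stats st`, calling
`model_sendmsg_locked(&st, 10000, 4096)` fires the hook three times (4096,
4096 and 1808 bytes), whereas a hook at function entry would only see
size=10000 once.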