Re: [PATCH] net: tcp_drop adds `reason` parameter for tracing v2

From: Eric Dumazet
Date: Wed Sep 01 2021 - 11:21:16 EST


On Wed, Sep 1, 2021 at 7:36 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 26 Aug 2021 15:13:07 +1000
> Brendan Gregg <brendan.d.gregg@xxxxxxxxx> wrote:
>
> > On Thu, Aug 26, 2021 at 1:20 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > On Wed, 25 Aug 2021 08:47:46 -0700
> > > Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
> > >
> > > > > @@ -5703,15 +5700,15 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
> > > > > TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> > > > > NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNCHALLENGE);
> > > > > tcp_send_challenge_ack(sk, skb);
> > > > > - goto discard;
> > > > > + tcp_drop(sk, skb, TCP_DROP_MASK(__LINE__, TCP_VALIDATE_INCOMING));
> > > >
> > > > I'd rather use a string. So that we can more easily identify _why_ the
> > > > packet was drop, without looking at the source code
> > > > of the exact kernel version to locate line number 1057
> > > >
> > > > You can be sure that we will get reports in the future from users of
> > > > heavily modified kernels.
> > > > Having to download a git tree, or apply semi-private patches is a no go.
> > > >
> > > > If you really want to include __FILE__ and __LINE__, these both can be
> > > > stringified and included in the report, with the help of macros.
> > >
> > > I agree the __LINE__ is pointless, but if this has a tracepoint
> > > involved, then you can simply enable the stacktrace trigger to it and
> > > it will save a stack trace in the ring buffer for you.
> > >
> > > echo stacktrace > /sys/kernel/tracing/events/tcp/tcp_drop/trigger
> > >
> > > And when the event triggers it will record a stack trace. You can also
> > > even add a filter to do it only for specific reasons.
> > >
> > > echo 'stacktrace if reason == 1' > /sys/kernel/tracing/events/tcp/tcp_drop/trigger
> > >
> > > And it even works for flags:
> > >
> > > echo 'stacktrace if reason & 0xa' > /sys/kernel/tracing/events/tcp/tcp_drop/trigger
> > >
> > > Which gives another reason to use an enum over a string.
> >
> > You can't do string comparisons? The more string support Ftrace has,
> > the more convenient they will be. Using bpftrace as an example of
> > convenience and showing drop frequency counted by human-readable
> > reason and stack trace:
>
> Yes, you can (and pretty much always had this ability), but having
> flags is usually makes it easier (and faster).
>
> You can have 'stacktrace if reason ~ "*string*"' which will match
> anything with "string" in it.
>
> My main argument against strings is more of the space they take up in
> the ring buffer than the ability to filter.

Understood the concern about size, but it seems the trace includes many things.
Can we have an estimate of the size needed per event ?
If we do not use symbolic, but numbers, I am afraid this trace event
will only be used by a few TCP experts.

+ TP_printk("src=%pISpc dest=%pISpc mark=%#x data_len=%d
snd_nxt=%#x snd_una=%#x \
+ snd_cwnd=%u ssthresh=%u snd_wnd=%u
srtt=%u rcv_wnd=%u \
+ sock_cookie=%llx reason=%d
reason_type=%s reason_line=%d",
__entry->saddr, __entry->daddr, __entry->mark,
__entry->data_len, __entry->snd_nxt,
__entry->snd_una,
__entry->snd_cwnd, __entry->ssthresh,
__entry->snd_wnd,
- __entry->srtt, __entry->rcv_wnd,
__entry->sock_cookie, __entry->reason)
+ __entry->srtt, __entry->rcv_wnd,
__entry->sock_cookie,
+ __entry->reason,
+ __print_symbolic(__entry->reason_code,
TCP_DROP_REASON),
+ __entry->reason_line)
);