Re: [PATCH] net/ipv4: add tracepoint for icmp_send

From: Jason Xing
Date: Tue Feb 27 2024 - 01:39:28 EST


On Tue, Feb 27, 2024 at 1:49 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>
> On Tue, Feb 27, 2024 at 3:50 AM <xu.xin16@xxxxxxxxxx> wrote:
> >
> > From: xu xin <xu.xin16@xxxxxxxxxx>
> >
> > Introduce a tracepoint for icmp_send, which helps users obtain more
> > detailed information when abnormal ICMP events happen.
> >
> > 1. Use case example:
> > ====================
> > When an application experiences packet loss due to an unreachable UDP
> > destination port, the kernel will send an exception message through the
> > icmp_send function. By adding a tracepoint for icmp_send, developers or
> > system administrators can easily obtain detailed information about the
> > UDP packet loss, including the type, code, source address, destination
> > address, source port, and destination port. This facilitates the
> > troubleshooting of packet loss issues, especially in complex
> > network-service applications.
> >
> > 2. Operation Instructions:
> > ==========================
> > Switch to the tracing directory.
> > cd /sys/kernel/debug/tracing
> > Filter for destination port unreachable.
> > echo "type==3 && code==3" > events/icmp/icmp_send/filter
> > Enable trace event.
> > echo 1 > events/icmp/icmp_send/enable
> >
> > 3. Result View:
> > ================
> > udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send:
> > icmp_send: type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23
> > skbaddr=00000000589b167a
> >
> > Signed-off-by: He Peilin <he.peilin@xxxxxxxxxx>
> > Reviewed-by: xu xin <xu.xin16@xxxxxxxxxx>
> > Reviewed-by: Yunkai Zhang <zhang.yunkai@xxxxxxxxxx>
> > Cc: Yang Yang <yang.yang29@xxxxxxxxxx>
> > Cc: Liu Chun <liu.chun2@xxxxxxxxxx>
> > Cc: Xuexin Jiang <jiang.xuexin@xxxxxxxxxx>
> > ---
> > include/trace/events/icmp.h | 57 +++++++++++++++++++++++++++++++++++++++++++++
> > net/ipv4/icmp.c | 4 ++++
> > 2 files changed, 61 insertions(+)
> > create mode 100644 include/trace/events/icmp.h
> >
> > diff --git a/include/trace/events/icmp.h b/include/trace/events/icmp.h
> > new file mode 100644
> > index 000000000000..3d9af5769bc3
> > --- /dev/null
> > +++ b/include/trace/events/icmp.h
> > @@ -0,0 +1,57 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM icmp
> > +
> > +#if !defined(_TRACE_ICMP_H) || defined(TRACE_HEADER_MULTI_READ)
> > +#define _TRACE_ICMP_H
> > +
> > +#include <linux/icmp.h>
> > +#include <linux/tracepoint.h>
> > +
> > +TRACE_EVENT(icmp_send,
> > +
> > + TP_PROTO(const struct sk_buff *skb, int type, int code),
> > +
> > + TP_ARGS(skb, type, code),
> > +
> > + TP_STRUCT__entry(
> > + __field(__u16, sport)
> > + __field(__u16, dport)
> > + __field(unsigned short, ulen)
> > + __field(const void *, skbaddr)
> > + __field(int, type)
> > + __field(int, code)
> > + __array(__u8, saddr, 4)
> > + __array(__u8, daddr, 4)
> > + ),
> > +
> > + TP_fast_assign(
> > + // Get UDP header
> > + struct udphdr *uh = udp_hdr(skb);
> > + struct iphdr *iph = ip_hdr(skb);
> > + __be32 *p32;
> > +
> > + __entry->sport = ntohs(uh->source);
> > + __entry->dport = ntohs(uh->dest);
> > + __entry->ulen = ntohs(uh->len);
> > + __entry->skbaddr = skb;
> > + __entry->type = type;
> > + __entry->code = code;
> > +
> > + p32 = (__be32 *) __entry->saddr;
> > + *p32 = iph->saddr;
> > +
> > + p32 = (__be32 *) __entry->daddr;
> > + *p32 = iph->daddr;
> > + ),
> > +
>
> FYI, ICMP can be generated for many protocols other than UDP.
>
> > + TP_printk("icmp_send: type=%d, code=%d. From %pI4:%u to %pI4:%u ulen=%d skbaddr=%p",
> > + __entry->type, __entry->code,
> > + __entry->saddr, __entry->sport, __entry->daddr,
> > + __entry->dport, __entry->ulen, __entry->skbaddr)
> > +);
> > +
> > +#endif /* _TRACE_ICMP_H */
> > +
> > +/* This part must be outside protection */
> > +#include <trace/define_trace.h>
> > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> > index e63a3bf99617..437bdb7e2650 100644
> > --- a/net/ipv4/icmp.c
> > +++ b/net/ipv4/icmp.c
> > @@ -92,6 +92,8 @@
> > #include <net/inet_common.h>
> > #include <net/ip_fib.h>
> > #include <net/l3mdev.h>
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/icmp.h>
> >
> > /*
> > * Build xmit assembly blocks
> > @@ -599,6 +601,8 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> > struct net *net;
> > struct sock *sk;
> >
> > + trace_icmp_send(skb_in, type, code);
>
> I think you missed many sanity checks between lines 622 and 676
[...]
>
> Honestly, a kprobe BPF based solution would be less risky, and less
> maintenance for us.
>

I agree. I wonder if we could also remove some trace_* calls that sit at
the very beginning of their caller functions, e.g. trace_tcp_probe() and
trace_tcp_rcv_space_adjust(), since they can easily be replaced with BPF
tools and we would then have less to maintain.
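
On the point that ICMP can be generated for protocols other than UDP: the
following is only a minimal userspace sketch, not kernel code and not part of
the patch, of the kind of check any tracepoint or BPF program would need
before interpreting the bytes after the IP header as a UDP header. The names
icmp_trace_info and icmp_trace_parse are hypothetical and exist only for this
illustration; the input is assumed to be a raw IPv4 packet in a byte buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohs */
#include <netinet/in.h>  /* IPPROTO_UDP */

/* Hypothetical record, loosely mirroring the fields of the proposed event. */
struct icmp_trace_info {
	uint8_t  protocol;   /* IPv4 protocol field */
	bool     has_ports;  /* true only if a full UDP header was parsed */
	uint16_t sport;
	uint16_t dport;
	uint16_t ulen;
};

/*
 * Parse a raw IPv4 packet. UDP fields are filled in only when the packet
 * is UDP, is not a later fragment, and contains a complete UDP header.
 * Returns false if the buffer does not hold a valid IPv4 header.
 */
bool icmp_trace_parse(const uint8_t *pkt, size_t len,
		      struct icmp_trace_info *out)
{
	uint16_t v;
	size_t ihl;

	memset(out, 0, sizeof(*out));

	if (len < 20 || (pkt[0] >> 4) != 4)
		return false;

	ihl = (size_t)(pkt[0] & 0x0f) * 4;  /* header length in bytes */
	if (ihl < 20 || len < ihl)
		return false;

	out->protocol = pkt[9];

	/* Fragment offset: low 13 bits of bytes 6..7. */
	memcpy(&v, pkt + 6, sizeof(v));
	if ((ntohs(v) & 0x1fff) != 0)
		return true;  /* later fragment: no transport header here */

	if (out->protocol != IPPROTO_UDP || len < ihl + 8)
		return true;  /* not UDP, or UDP header truncated */

	memcpy(&v, pkt + ihl, sizeof(v));
	out->sport = ntohs(v);
	memcpy(&v, pkt + ihl + 2, sizeof(v));
	out->dport = ntohs(v);
	memcpy(&v, pkt + ihl + 4, sizeof(v));
	out->ulen = ntohs(v);
	out->has_ports = true;
	return true;
}
```

A caller would print the ports only when has_ports is set, and fall back to
type, code, protocol and addresses otherwise. The same guard applies whether
the logic lives in a tracepoint or in a kprobe-attached BPF program.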

Thanks,
Jason