Re: [PATCH] net/ipv4: add tracepoint for icmp_send

From: Eric Dumazet
Date: Tue Feb 27 2024 - 00:49:24 EST


On Tue, Feb 27, 2024 at 3:50 AM <xu.xin16@xxxxxxxxxx> wrote:
>
> From: xu xin <xu.xin16@xxxxxxxxxx>
>
> Introduce a tracepoint for icmp_send, which can help users get more
> detailed information conveniently when abnormal ICMP events happen.
>
> 1. A use case example:
> ======================
> When an application experiences packet loss due to an unreachable UDP
> destination port, the kernel will send an exception message through the
> icmp_send function. By adding a tracepoint for icmp_send, developers or
> system administrators can easily obtain detailed information about the
> UDP packet loss, including the type, code, source address, destination
> address, source port, and destination port. This facilitates
> troubleshooting packet loss issues, especially for complicated
> network-service applications.
>
> 2. Operation Instructions:
> ==========================
> Switch to the tracing directory.
> cd /sys/kernel/debug/tracing
> Filter for destination port unreachable.
> echo "type==3 && code==3" > events/icmp/icmp_send/filter
> Enable trace event.
> echo 1 > events/icmp/icmp_send/enable
>
> 3. Result View:
> ================
> udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send:
> icmp_send: type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23
> skbaddr=00000000589b167a
>
> Signed-off-by: He Peilin <he.peilin@xxxxxxxxxx>
> Reviewed-by: xu xin <xu.xin16@xxxxxxxxxx>
> Reviewed-by: Yunkai Zhang <zhang.yunkai@xxxxxxxxxx>
> Cc: Yang Yang <yang.yang29@xxxxxxxxxx>
> Cc: Liu Chun <liu.chun2@xxxxxxxxxx>
> Cc: Xuexin Jiang <jiang.xuexin@xxxxxxxxxx>
> ---
> include/trace/events/icmp.h | 57 +++++++++++++++++++++++++++++++++++++++++++++
> net/ipv4/icmp.c | 4 ++++
> 2 files changed, 61 insertions(+)
> create mode 100644 include/trace/events/icmp.h
>
> diff --git a/include/trace/events/icmp.h b/include/trace/events/icmp.h
> new file mode 100644
> index 000000000000..3d9af5769bc3
> --- /dev/null
> +++ b/include/trace/events/icmp.h
> @@ -0,0 +1,57 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM icmp
> +
> +#if !defined(_TRACE_ICMP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_ICMP_H
> +
> +#include <linux/icmp.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(icmp_send,
> +
> +	TP_PROTO(const struct sk_buff *skb, int type, int code),
> +
> +	TP_ARGS(skb, type, code),
> +
> +	TP_STRUCT__entry(
> +		__field(__u16, sport)
> +		__field(__u16, dport)
> +		__field(unsigned short, ulen)
> +		__field(const void *, skbaddr)
> +		__field(int, type)
> +		__field(int, code)
> +		__array(__u8, saddr, 4)
> +		__array(__u8, daddr, 4)
> +	),
> +
> +	TP_fast_assign(
> +		// Get UDP header
> +		struct udphdr *uh = udp_hdr(skb);
> +		struct iphdr *iph = ip_hdr(skb);
> +		__be32 *p32;
> +
> +		__entry->sport = ntohs(uh->source);
> +		__entry->dport = ntohs(uh->dest);
> +		__entry->ulen = ntohs(uh->len);
> +		__entry->skbaddr = skb;
> +		__entry->type = type;
> +		__entry->code = code;
> +
> +		p32 = (__be32 *) __entry->saddr;
> +		*p32 = iph->saddr;
> +
> +		p32 = (__be32 *) __entry->daddr;
> +		*p32 = iph->daddr;
> +	),
> +
> +

FYI, ICMP can be generated for many protocols other than UDP, so
udp_hdr(skb) is not guaranteed to point at a UDP header here.
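
To illustrate the point, here is a minimal userspace sketch (not kernel
code, and not part of the patch) of the kind of guard that is needed
before reading ports: check the IP protocol field, the fragment offset
and the remaining length first. get_udp_ports() and struct ports are
hypothetical names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define PROTO_UDP 17 /* IANA protocol number for UDP */

/* Hypothetical result type for this sketch. */
struct ports {
	uint16_t sport;
	uint16_t dport;
	int valid; /* 1 only if the ports were read from a real UDP header */
};

/*
 * Hypothetical helper: extract UDP ports from a raw IPv4 packet.
 * Returns valid == 0 when the packet is not IPv4/UDP, is a non-first
 * fragment (no UDP header present), or is too short.
 */
static struct ports get_udp_ports(const uint8_t *pkt, size_t len)
{
	struct ports p = { 0, 0, 0 };
	size_t ihl;
	unsigned int frag_off;

	/* Need at least a minimal IPv4 header, and version must be 4. */
	if (len < 20 || (pkt[0] >> 4) != 4)
		return p;

	/* Header length is given in 32-bit words. */
	ihl = (size_t)(pkt[0] & 0x0f) * 4;
	if (ihl < 20)
		return p;

	/* Only UDP carries a UDP header. */
	if (pkt[9] != PROTO_UDP)
		return p;

	/* Non-first fragments do not contain the transport header. */
	frag_off = ((unsigned int)(pkt[6] & 0x1f) << 8) | pkt[7];
	if (frag_off != 0)
		return p;

	/* The 8-byte UDP header must fit in the buffer. */
	if (len < ihl + 8)
		return p;

	/* Ports are stored in network byte order (big endian). */
	p.sport = (uint16_t)((pkt[ihl] << 8) | pkt[ihl + 1]);
	p.dport = (uint16_t)((pkt[ihl + 2] << 8) | pkt[ihl + 3]);
	p.valid = 1;
	return p;
}
```

A tracepoint would need the equivalent checks (or would have to leave
the port fields at zero) for anything that is not an unfragmented UDP
packet.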

> +	TP_printk("icmp_send: type=%d, code=%d. From %pI4:%u to %pI4:%u ulen=%d skbaddr=%p",
> +		  __entry->type, __entry->code,
> +		  __entry->saddr, __entry->sport, __entry->daddr,
> +		  __entry->dport, __entry->ulen, __entry->skbaddr)
> +);
> +
> +#endif /* _TRACE_ICMP_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index e63a3bf99617..437bdb7e2650 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -92,6 +92,8 @@
> #include <net/inet_common.h>
> #include <net/ip_fib.h>
> #include <net/l3mdev.h>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/icmp.h>
>
> /*
> * Build xmit assembly blocks
> @@ -599,6 +601,8 @@ void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
> struct net *net;
> struct sock *sk;
>
> + trace_icmp_send(skb_in, type, code);

I think you missed the many sanity checks between lines 622 and 676:
this tracepoint fires at function entry, before any of them have run.

Honestly, a kprobe/BPF-based solution would be less risky, and less of
a maintenance burden for us.