Re: [RFC v3 net-next 1/7] net: add rx_sk to trace_kfree_skb

From: Steven Rostedt
Date: Wed Jun 05 2024 - 19:57:59 EST


On Tue, 4 Jun 2024 14:47:38 -0700
Yan Zhai <yan@xxxxxxxxxxxxxx> wrote:

> skb does not include enough information to find out receiving
> sockets/services and netns/containers on packet drops. In theory
> skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
> stack for OOO packet lookup. Similarly, skb->sk often identifies a local
> sender, and tells nothing about a receiver.
>
> Allow passing an extra receiving socket to the tracepoint to improve
> the visibility on receiving drops.
>
> Signed-off-by: Yan Zhai <yan@xxxxxxxxxxxxxx>
> ---
> v2->v3: fixed drop_monitor function prototype
> ---
> include/trace/events/skb.h | 11 +++++++----
> net/core/dev.c | 2 +-
> net/core/drop_monitor.c | 9 ++++++---
> net/core/skbuff.c | 2 +-
> 4 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index 07e0715628ec..aa6b46b6172c 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -24,15 +24,16 @@ DEFINE_DROP_REASON(FN, FN)
> TRACE_EVENT(kfree_skb,
>
> TP_PROTO(struct sk_buff *skb, void *location,
> - enum skb_drop_reason reason),
> + enum skb_drop_reason reason, struct sock *rx_sk),
>
> - TP_ARGS(skb, location, reason),
> + TP_ARGS(skb, location, reason, rx_sk),
>
> TP_STRUCT__entry(
> __field(void *, skbaddr)
> __field(void *, location)
> __field(unsigned short, protocol)
> __field(enum skb_drop_reason, reason)
> + __field(void *, rx_skaddr)

Please add the pointer after the other pointers:

__field(void *, skbaddr)
__field(void *, location)
+ __field(void *, rx_skaddr)
__field(unsigned short, protocol)
__field(enum skb_drop_reason, reason)

otherwise you are adding holes in the ring buffer event.

The TP_STRUCT__entry() is a structure that is saved in the ring buffer. We
want to avoid alignment holes. I also question having a short before the
enum, if the emum is 4 bytes. The short should be at the end.

In fact, looking at the format file, there is a 2 byte hole:

# cat /sys/kernel/tracing/events/skb/kfree_skb/format

name: kfree_skb
ID: 1799
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:void * skbaddr; offset:8; size:8; signed:0;
field:void * location; offset:16; size:8; signed:0;
field:unsigned short protocol; offset:24; size:2; signed:0;
field:enum skb_drop_reason reason; offset:28; size:4; signed:0;

Notice that "protocol" is 2 bytes in size at offset 24, but "reason" starts
at offset 28. This means at offset 26, there's a 2 byte hole.

-- Steve



> ),
>
> TP_fast_assign(
> @@ -40,12 +41,14 @@ TRACE_EVENT(kfree_skb,
> __entry->location = location;
> __entry->protocol = ntohs(skb->protocol);
> __entry->reason = reason;
> + __entry->rx_skaddr = rx_sk;
> ),
>