Re: [RFC PATCH bpf-next v1 3/3] net: Add additional bit to support userspace timestamp type

From: Willem de Bruijn
Date: Wed Apr 10 2024 - 11:42:36 EST


Abhishek Chauhan wrote:
> tstamp_type can be real, mono or userspace timestamp.
>
> This commit adds userspace timestamp and sets it if there is
> valid transmit_time available in socket coming from userspace.
>
> To make the design scalable for future needs, this commit brings in
> the change to extend the tstamp_type:1 to tstamp_type:2 to support
> userspace timestamp.
>
> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@xxxxxxxxx/
> Signed-off-by: Abhishek Chauhan <quic_abchauha@xxxxxxxxxxx>
> ---
> include/linux/skbuff.h | 19 +++++++++++++++++--
> net/ipv4/ip_output.c | 2 +-
> net/ipv4/raw.c | 2 +-
> net/ipv6/ip6_output.c | 2 +-
> net/ipv6/raw.c | 2 +-
> net/packet/af_packet.c | 6 +++---
> 6 files changed, 24 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 6160185f0fe0..2f91a8a2157a 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -705,6 +705,9 @@ typedef unsigned char *sk_buff_data_t;
> enum skb_tstamp_type {
> SKB_TSTAMP_TYPE_RX_REAL = 0, /* A RX (receive) time in real */
> SKB_TSTAMP_TYPE_TX_MONO = 1, /* A TX (delivery) time in mono */
> + SKB_TSTAMP_TYPE_TX_USER = 2, /* A TX (delivery) time and its clock
> + * is in skb->sk->sk_clockid.
> + */

Weird indentation?

More fundamentally: instead of defining a type TX_USER, can we use a
real clockid (e.g., CLOCK_TAI) based on skb->sk->sk_clockid, rather
than store an id that means "go look at sk_clockid"?
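
To make the suggestion concrete, here is a minimal userspace sketch, not
kernel code. All SKETCH_* names and the helper are hypothetical and only
illustrate the idea: translate the socket's clockid once, when the
timestamp is set, so that the type stored in the skb names a clock base
directly. The clockid values are assumed to mirror the Linux UAPI ones.

```c
/* Userspace sketch, not kernel code. Clockid values mirror Linux UAPI. */
#define SKETCH_CLOCK_REALTIME	0
#define SKETCH_CLOCK_MONOTONIC	1
#define SKETCH_CLOCK_TAI	11

/* Hypothetical type values: each one names a concrete clock base. */
enum sketch_tstamp_type {
	SKETCH_TSTAMP_REALTIME = 0,
	SKETCH_TSTAMP_MONOTONIC = 1,
	SKETCH_TSTAMP_TAI = 2,
};

/*
 * Translate a socket clockid into a stored type.
 * Returns -1 for clocks this sketch has no type value for.
 */
static int sketch_clockid_to_tstamp_type(int clockid)
{
	switch (clockid) {
	case SKETCH_CLOCK_REALTIME:
		return SKETCH_TSTAMP_REALTIME;
	case SKETCH_CLOCK_MONOTONIC:
		return SKETCH_TSTAMP_MONOTONIC;
	case SKETCH_CLOCK_TAI:
		return SKETCH_TSTAMP_TAI;
	default:
		return -1;
	}
}
```

With this shape a consumer of skb->tstamp would not need to dereference
skb->sk to learn the clock base.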

> };
>
> /**
> @@ -830,6 +833,9 @@ enum skb_tstamp_type {
> * delivery_time in mono clock base (i.e. EDT). Otherwise, the
> * skb->tstamp has the (rcv) timestamp at ingress and
> * delivery_time at egress.
> + * delivery_time in mono clock base (i.e., EDT) or a clock base chosen
> + * by SO_TXTIME. If zero, skb->tstamp has the (rcv) timestamp at
> + * ingress.
> * @napi_id: id of the NAPI struct this skb came from
> * @sender_cpu: (aka @napi_id) source CPU in XPS
> * @alloc_cpu: CPU which did the skb allocation.
> @@ -960,7 +966,7 @@ struct sk_buff {
> /* private: */
> __u8 __mono_tc_offset[0];
> /* public: */
> - __u8 tstamp_type:1; /* See SKB_MONO_DELIVERY_TIME_MASK */
> + __u8 tstamp_type:2; /* See SKB_MONO_DELIVERY_TIME_MASK */
> #ifdef CONFIG_NET_XGRESS
> __u8 tc_at_ingress:1; /* See TC_AT_INGRESS_MASK */
> __u8 tc_skip_classify:1;

Does this have an effect on sk_buff layout? Please check with pahole.
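
The concern can be illustrated with a small userspace sketch. The
structs below are hypothetical and are not the real sk_buff layout: the
other_flags field is made up so that the byte is exactly full before the
change. Bit-field layout is implementation-defined, so this only shows
the mechanism. Whether sk_buff itself has a spare bit at this position
is something only pahole on a real build can answer.

```c
#include <stddef.h>

/* Hypothetical flag byte before the change: 1 + 1 + 1 + 5 = 8 bits. */
struct sketch_flags_before {
	unsigned char tstamp_type:1;
	unsigned char tc_at_ingress:1;
	unsigned char tc_skip_classify:1;
	unsigned char other_flags:5;	/* made up: fills the byte exactly */
};

/* Same fields with tstamp_type widened: 2 + 1 + 1 + 5 = 9 bits. */
struct sketch_flags_after {
	unsigned char tstamp_type:2;
	unsigned char tc_at_ingress:1;
	unsigned char tc_skip_classify:1;
	unsigned char other_flags:5;	/* no longer fits in the first byte */
};

/* Number of bytes the widened bit-field adds in this made-up layout. */
static size_t sketch_layout_growth(void)
{
	return sizeof(struct sketch_flags_after) -
	       sizeof(struct sketch_flags_before);
}
```

If the real flag byte still has a free bit, the widening is free. If
not, later fields shift, which is what the pahole output would show.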

> @@ -4274,7 +4280,16 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
> enum skb_tstamp_type tstamp_type)
> {
> skb->tstamp = kt;
> - skb->tstamp_type = kt && tstamp_type;
> +
> + if (skb->tstamp_type)
> + return;
> +

Why bail if a type is already set? And what if
skb->tstamp_type != tstamp_type? Should skb->tstamp then not be
written to (i.e., the test moved up), perhaps with a rate-limited
warning?
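
One possible reading of "the test moved up", as a userspace sketch
rather than kernel code. The struct, the SKETCH_* constants and the
function name are hypothetical. The fprintf() stands in for a
rate-limited kernel warning such as net_warn_ratelimited(). Clearing the
type when kt is zero follows the behaviour of the line being removed
(kt && tstamp_type).

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_RX_REAL	0u
#define SKETCH_TX_MONO	1u
#define SKETCH_TX_USER	2u

/* Hypothetical stand-in for the two sk_buff fields involved. */
struct sketch_skb {
	int64_t tstamp;
	unsigned char tstamp_type:2;
};

/*
 * Check for a conflicting type before touching the timestamp.
 * Returns 0 on success, -1 if the skb already carries another type,
 * in which case neither field is modified.
 */
static int sketch_set_delivery_time(struct sketch_skb *skb, int64_t kt,
				    unsigned int tstamp_type)
{
	if (skb->tstamp_type && skb->tstamp_type != tstamp_type) {
		/* stand-in for a rate-limited warning */
		fprintf(stderr, "tstamp_type mismatch: have %u, got %u\n",
			(unsigned int)skb->tstamp_type, tstamp_type);
		return -1;
	}

	skb->tstamp = kt;
	skb->tstamp_type = kt ? tstamp_type : SKETCH_RX_REAL;
	return 0;
}
```

The point is only the ordering: the mismatch test runs before
skb->tstamp is written, so a conflicting caller cannot leave a timestamp
in one clock base labelled with another.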

> + if (kt && tstamp_type == SKB_TSTAMP_TYPE_TX_MONO)
> + skb->tstamp_type = SKB_TSTAMP_TYPE_TX_MONO;
> +
> + if (kt && tstamp_type == SKB_TSTAMP_TYPE_TX_USER)
> + skb->tstamp_type = SKB_TSTAMP_TYPE_TX_USER;

Simpler:

        if (kt)
                skb->tstamp_type = tstamp_type;