Re: [RFC PATCH bpf-next v3 2/2] net: Add additional bit to support userspace timestamp type

From: Martin KaFai Lau
Date: Mon Apr 15 2024 - 16:01:06 EST


On 4/13/24 12:07 PM, Willem de Bruijn wrote:
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a83a2120b57f..b6346c21c3d4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -827,7 +827,8 @@ enum skb_tstamp_type {
* @tstamp_type: When set, skb->tstamp has the
* delivery_time in mono clock base (i.e. EDT). Otherwise, the
* skb->tstamp has the (rcv) timestamp at ingress and
- * delivery_time at egress.
+ * delivery_time at egress or skb->tstamp defined by skb->sk->sk_clockid
+ * coming from userspace
* @napi_id: id of the NAPI struct this skb came from
* @sender_cpu: (aka @napi_id) source CPU in XPS
* @alloc_cpu: CPU which did the skb allocation.
@@ -955,7 +956,7 @@ struct sk_buff {
/* private: */
__u8 __mono_tc_offset[0];
/* public: */
- __u8 tstamp_type:1; /* See SKB_MONO_DELIVERY_TIME_MASK */
+ __u8 tstamp_type:2; /* See SKB_MONO_DELIVERY_TIME_MASK */
#ifdef CONFIG_NET_XGRESS
__u8 tc_at_ingress:1; /* See TC_AT_INGRESS_MASK */
__u8 tc_skip_classify:1;

A quick pahole for a fairly standard .config that I had laying around
shows a hole after this list of bits, so no huge concerns there from
adding a bit:

__u8 slow_gro:1; /* 3: 4 1 */
__u8 csum_not_inet:1; /* 3: 5 1 */

/* XXX 2 bits hole, try to pack */

__u16 tc_index; /* 4 2 */

@@ -1090,10 +1091,10 @@ struct sk_buff {
*/
#ifdef __BIG_ENDIAN_BITFIELD
#define SKB_MONO_DELIVERY_TIME_MASK (1 << 7)
-#define TC_AT_INGRESS_MASK (1 << 6)
+#define TC_AT_INGRESS_MASK (1 << 5)

Have to be careful when adding a new 2 bit tstamp_type with both bits
set, that this does not incorrectly get interpreted as MONO.

I haven't looked closely at the BPF API, but hopefully it can be
extensible to return the specific type. If it is hardcoded to return
either MONO or not, then only 0x1 should match, not 0x3.

Good point. I believe it is the best to have bpf to consider both bits in tstamp_type:2 in filter.c to avoid the 0x3 surprise in the future. The BPF API can be extended to support SKB_CLOCK_TAI.

Regardless, in bpf_convert_tstamp_write(), it still needs to clear both bits in tstamp_type when it is at ingress. Right now it only clears the mono bit.

Then it may as well consider both tstamp_type:2 bits in bpf_convert_tstamp_read() and bpf_convert_tstamp_type_read(). e.g. bpf_convert_tstamp_type_read(), it should be a pretty straight forward change because the SKB_CLOCK_* enum value should be a 1:1 mapping to the BPF_SKB_TSTAMP_*.


#else
#define SKB_MONO_DELIVERY_TIME_MASK (1 << 0)
-#define TC_AT_INGRESS_MASK (1 << 1)
+#define TC_AT_INGRESS_MASK (1 << 2)
#endif
#define SKB_BF_MONO_TC_OFFSET offsetof(struct sk_buff, __mono_tc_offset)