Re: [PATCH v4 net-next 2/3] net/udp: Add 4-tuple hash list basis
From: Philo Lu
Date: Wed Oct 16 2024 - 02:31:12 EST
On 2024/10/14 18:07, Paolo Abeni wrote:
Hi,
On 10/12/24 03:29, Philo Lu wrote:
@@ -3480,13 +3486,14 @@ static struct udp_table __net_init *udp_pernet_table_alloc(unsigned int hash_ent
 	if (!udptable)
 		goto out;
-	slot_size = sizeof(struct udp_hslot) + sizeof(struct udp_hslot_main);
+	slot_size = 2 * sizeof(struct udp_hslot) + sizeof(struct udp_hslot_main);
 	udptable->hash = vmalloc_huge(hash_entries * slot_size,
 				      GFP_KERNEL_ACCOUNT);
I'm sorry for the late feedback.
I think it would be better to make the hash4 infra a no op (no lookup,
no additional memory used) for CONFIG_BASE_SMALL=y builds.
Got it. There are two affected structs, udp_hslot and udp_sock. They (as
well as related helpers such as udp4_hash4) can be guarded by
CONFIG_BASE_SMALL checks, so that enabling BASE_SMALL eliminates the
additional overhead of hash4.
```
+struct udp_hslot_main {
+ struct udp_hslot hslot; /* must be the first member */
+#if !IS_ENABLED(CONFIG_BASE_SMALL)
+ u32 hash4_cnt;
+#endif
+} __aligned(2 * sizeof(long));
@@ -56,6 +56,12 @@ struct udp_sock {
int pending; /* Any pending frames ? */
__u8 encap_type; /* Is this an Encapsulation socket? */
+#if !IS_ENABLED(CONFIG_BASE_SMALL)
+ /* For UDP 4-tuple hash */
+ __u16 udp_lrpa_hash;
+ struct hlist_node udp_lrpa_node;
+#endif
+
```
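For the helpers, the same guard could select between a real implementation and an empty stub, so callers need no #ifdef of their own. Below is a minimal userspace sketch of that pattern, not the code from this series: DEMO_BASE_SMALL stands in for CONFIG_BASE_SMALL, and demo_hslot_main, demo_has_hash4() and demo_hash4_inc() are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_BASE_SMALL; build with
 * -DDEMO_BASE_SMALL=1 to model a small configuration.
 */
#ifndef DEMO_BASE_SMALL
#define DEMO_BASE_SMALL 0
#endif

/* Simplified model of a main hash slot (not the kernel struct). */
struct demo_hslot_main {
	unsigned int count;	/* models existing per-slot state */
#if !DEMO_BASE_SMALL
	uint32_t hash4_cnt;	/* only present when hash4 is compiled in */
#endif
};

#if !DEMO_BASE_SMALL
/* Regular builds: the helpers operate on the extra counter. */
static inline bool demo_has_hash4(const struct demo_hslot_main *h)
{
	return h->hash4_cnt != 0;
}

static inline void demo_hash4_inc(struct demo_hslot_main *h)
{
	h->hash4_cnt++;
}
#else
/* Small builds: stubs, so callers always skip the 4-tuple path
 * and no extra memory is used.
 */
static inline bool demo_has_hash4(const struct demo_hslot_main *h)
{
	(void)h;
	return false;
}

static inline void demo_hash4_inc(struct demo_hslot_main *h)
{
	(void)h;
}
#endif
```

With stubs like these, a lookup path can test demo_has_hash4() unconditionally; in the small configuration the compiler sees a constant false and can drop the hash4 branch.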
It would be great if you could please share some benchmark showing the
raw max receive PPS performances for unconnected sockets, with and
without this series applied, to ensure this does not cause any real
regression for such workloads.
Tested using sockperf tp with the default message size (14B), three runs
each with and without the patch set. The results show no obvious
difference:
[msg/sec]    test1     test2     test3     mean
w/o patch    514,664   519,040   527,115   520.3k
w/  patch    516,863   526,337   527,195   523.5k (+0.6%)
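For reference, the mean and relative-difference columns can be recomputed from the three runs with a few lines of C. This is only a sketch of the arithmetic; the helper names demo_mean() and demo_rel_change_pct() are made up for illustration, and the sample arrays simply copy the numbers reported above.

```c
#include <stddef.h>

/* Per-run results in msg/sec, copied from the table above. */
static const double demo_without_patch[3] = { 514664.0, 519040.0, 527115.0 };
static const double demo_with_patch[3]    = { 516863.0, 526337.0, 527195.0 };

/* Arithmetic mean of n samples (n must be > 0). */
static double demo_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change of val against base, in percent. */
static double demo_rel_change_pct(double base, double val)
{
	return (val - base) / base * 100.0;
}
```

Calling demo_rel_change_pct(demo_mean(demo_without_patch, 3), demo_mean(demo_with_patch, 3)) yields the relative difference between the two means as a percentage.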
Thank you for the review, Paolo.
--
Philo