Re: [PATCH net-next v3 1/7] enic: Create enic_wq/rq structures to bundle per wq/rq data

From: Vadim Fedorenko
Date: Sat Nov 09 2024 - 15:28:13 EST


On 08/11/2024 21:47, Nelson Escobar wrote:
Bundling the wq/rq specific data into dedicated enic_wq/rq structures
cleans up the enic structure and simplifies future changes related to
wq/rq.

Co-developed-by: John Daley <johndale@xxxxxxxxx>
Signed-off-by: John Daley <johndale@xxxxxxxxx>
Co-developed-by: Satish Kharat <satishkh@xxxxxxxxx>
Signed-off-by: Satish Kharat <satishkh@xxxxxxxxx>
Signed-off-by: Nelson Escobar <neescoba@xxxxxxxxx>
Reviewed-by: Simon Horman <horms@xxxxxxxxxx>
---
drivers/net/ethernet/cisco/enic/enic.h | 18 ++--
drivers/net/ethernet/cisco/enic/enic_ethtool.c | 4 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 120 ++++++++++++-------------
drivers/net/ethernet/cisco/enic/enic_res.c | 12 +--
4 files changed, 81 insertions(+), 73 deletions(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic.h b/drivers/net/ethernet/cisco/enic/enic.h
index 0cc3644ee8554f52401a0be7f44a1475ab2ea2b9..e6edb43515b97feeb21a9b55a1eeaa9b9381183f 100644
--- a/drivers/net/ethernet/cisco/enic/enic.h
+++ b/drivers/net/ethernet/cisco/enic/enic.h
@@ -162,6 +162,17 @@ struct enic_rq_stats {
u64 desc_skip; /* Rx pkt went into later buffer */
};
+struct enic_wq {
+ struct vnic_wq vwq;
+ struct enic_wq_stats stats;
+ spinlock_t lock; /* spinlock for wq */
+};
+

This change will make vnic_wq spinlock share the latest cache line of
queue statitics while protecting struct vnic_wq.

struct enic_wq {
struct vnic_wq vwq; /* 0 632 */

/* XXX last struct has 4 bytes of padding, 1 hole */

/* --- cacheline 9 boundary (576 bytes) was 56 bytes ago --- */
struct enic_wq_stats stats; /* 632 120 */
/* --- cacheline 11 boundary (704 bytes) was 48 bytes ago --- */
spinlock_t lock; /* 752 4 */

/* size: 760, cachelines: 12, members: 3 */
/* padding: 4 */
/* member types with holes: 1, total: 1 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 56 bytes */
};


That is not good for performance. Simple re-arrange of the struct makes
lock and vnic_wq structure properly aligned:

struct enic_wq {
spinlock_t lock; /* 0 4 */

/* XXX 4 bytes hole, try to pack */

struct vnic_wq vwq; /* 8 632 */

/* XXX last struct has 4 bytes of padding, 1 hole */

/* --- cacheline 10 boundary (640 bytes) --- */
struct enic_wq_stats stats; /* 640 120 */

/* size: 760, cachelines: 12, members: 3 */
/* sum members: 756, holes: 1, sum holes: 4 */
/* member types with holes: 1, total: 1 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 56 bytes */
};

Adding one u64 pad field to enic_wq_stats makes whole enic_wq properly
aligned to 12 cache lines on x86.

It might be worth adding ____cacheline_aligned to struct enic_wq to keep
the same performance expectations after converting to dynamic allocation
later in the patchset.

+struct enic_rq {
+ struct vnic_rq vrq;
+ struct enic_rq_stats stats;
+};
+
/* Per-instance private data structure */
struct enic {
struct net_device *netdev;