Re: [PATCH net-next v3 1/5] net: phy: microchip_ptp : Add header file for Microchip ptp library

From: Vadim Fedorenko
Date: Tue Nov 12 2024 - 17:56:42 EST


On 12/11/2024 22:26, Andrew Lunn wrote:
I believe the current design of mchp_ptp_clock has some issues:

struct mchp_ptp_clock {
struct mii_timestamper mii_ts; /* 0 48 */
struct phy_device * phydev; /* 48 8 */
struct sk_buff_head tx_queue; /* 56 24 */
/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
struct sk_buff_head rx_queue; /* 80 24 */
struct list_head rx_ts_list; /* 104 16 */
spinlock_t rx_ts_lock; /* 120 4 */
int hwts_tx_type; /* 124 4 */
/* --- cacheline 2 boundary (128 bytes) --- */
enum hwtstamp_rx_filters rx_filter; /* 128 4 */
int layer; /* 132 4 */
int version; /* 136 4 */

/* XXX 4 bytes hole, try to pack */

struct ptp_clock * ptp_clock; /* 144 8 */
struct ptp_clock_info caps; /* 152 184 */
/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
struct mutex ptp_lock; /* 336 32 */
u16 port_base_addr; /* 368 2 */
u16 clk_base_addr; /* 370 2 */
u8 mmd; /* 372 1 */

/* size: 376, cachelines: 6, members: 16 */
/* sum members: 369, holes: 1, sum holes: 4 */
/* padding: 3 */
/* last cacheline: 56 bytes */
};

tx_queue will be split across 2 cache lines and will have its spinlock on
the cache line after the one holding `struct sk_buff * next`. That means
2 cache lines will have to be fetched to access it, which may lead to
performance issues.

Another issue is that the locks in tx_queue and rx_queue, and rx_ts_lock,
all share the same cache line, which again can hurt performance on
systems that may have several rx/tx queues/irqs.

It would be great to try to reorder the struct a bit.
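
For illustration, here is a minimal userspace sketch of one possible
reordering. It assumes 64-byte cache lines and the x86-64 offsets from the
pahole output above. The fake_* types, struct sketch_clock and the
LINE_OF/STRADDLES macros are hypothetical stand-ins, not the driver's
definitions, and which fields count as "mostly read" is also an assumption.
In the kernel the alignment would more likely be expressed with
____cacheline_aligned_in_smp than with C11 _Alignas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumed cache line size */

/* Cache line index of a byte offset, and whether an object crosses a line. */
#define LINE_OF(off)		((off) / CACHELINE)
#define STRADDLES(off, size)	(LINE_OF(off) != LINE_OF((off) + (size) - 1))

/* Stand-ins sized like the kernel types in the dump (24 and 16 bytes). */
struct fake_skb_head {
	uint64_t next;	/* placeholder for struct sk_buff *next */
	uint64_t prev;	/* placeholder for struct sk_buff *prev */
	uint32_t qlen;
	uint32_t lock;	/* placeholder for spinlock_t, offset 20 */
};

struct fake_list_head {
	uint64_t next;
	uint64_t prev;
};

/* Current layout, offsets copied from the pahole output. */
static_assert(STRADDLES(56, 24), "tx_queue (offset 56) crosses a line");
static_assert(LINE_OF(56 + 20) == LINE_OF(80 + 20) &&
	      LINE_OF(80 + 20) == LINE_OF(120),
	      "tx_queue.lock, rx_queue.lock and rx_ts_lock share one line");

/* Sketch: one cache line per queue, each lock next to the data it guards. */
struct sketch_clock {
	/* Line 0: fields assumed to be read often but rarely written. */
	unsigned char mii_ts[48];	/* placeholder for struct mii_timestamper */
	uint64_t phydev;		/* placeholder for struct phy_device * */
	uint32_t hwts_tx_type;
	uint32_t rx_filter;

	_Alignas(CACHELINE) struct fake_skb_head tx_queue;	/* line 1 */
	_Alignas(CACHELINE) struct fake_skb_head rx_queue;	/* line 2 */
	_Alignas(CACHELINE) struct fake_list_head rx_ts_list;	/* line 3 */
	uint32_t rx_ts_lock;					/* line 3 */
	/* ptp_clock, caps, ptp_lock and the u16/u8 fields would follow. */
};

static_assert(!STRADDLES(offsetof(struct sketch_clock, tx_queue),
			 sizeof(struct fake_skb_head)),
	      "tx_queue fits in a single line");
static_assert(LINE_OF(offsetof(struct sketch_clock, tx_queue)) !=
	      LINE_OF(offsetof(struct sketch_clock, rx_queue)),
	      "tx and rx queue locks no longer share a line");
static_assert(LINE_OF(offsetof(struct sketch_clock, rx_ts_list)) ==
	      LINE_OF(offsetof(struct sketch_clock, rx_ts_lock)),
	      "rx_ts_lock sits with the list it protects");
```

The trade-off is size: the alignment padding grows the hot part of the
struct to four full cache lines, which only pays off if these paths really
are contended.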

Dumb question: How much of this is in the hot path? If this is only
used for a couple of PTP packets per second, do we care about a couple
of cache misses per second? Or will every single packet the PHY
processes be affected by this?

Even if only PTP packets are timestamped, imagine someone running the
PTP server side with a sizeable number of clients. It is also valid to
configure more than 1 sync packet per second. It may become quite hot.