RE: [EXTERNAL] Re: [PATCH net-next 9/9] net: atlantic: add PTP support for AQC113 (Antigua) (Antigua)

From: Sukhdeep Soni [C]

Date: Tue May 19 2026 - 00:04:48 EST


On 12 May 2026, Paolo Abeni wrote:
> ---
> net: atlantic: add PTP support for AQC113 (Antigua) (Antigua)
>
> This patch adds IEEE 1588 PTP support for the AQC113 (Antigua) controller
> alongside the existing AQC107 implementation, covering dual TSG clocks,
> descriptor-writeback TX timestamps, per-instance PTP offsets, L3/L4
> multicast filters, and direct hardware clock control.
>
> A few observations and questions below.
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
> index 4ef4fe64b8ac..aadf3f7f40d0 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
> @@ -113,16 +110,25 @@ static netdev_tx_t aq_ndev_start_xmit(struct sk_buff *skb, struct net_device *nd
> * and hardware PTP design of the chip. Otherwise ptp stream
> * will fail to sync
> */
> - if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) ||
> - unlikely((ip_hdr(skb)->version == 4) &&
> - (ip_hdr(skb)->protocol == IPPROTO_UDP) &&
> - ((udp_hdr(skb)->dest == htons(319)) ||
> - (udp_hdr(skb)->dest == htons(320)))) ||
> - unlikely(eth_hdr(skb)->h_proto == htons(ETH_P_1588)))
> + if (unlikely(skb->protocol == htons(ETH_P_IP) &&
> + ip_hdr(skb)->protocol == IPPROTO_UDP &&
> + (udp_hdr(skb)->dest == htons(PTP_EV_PORT) ||
> + udp_hdr(skb)->dest == htons(PTP_GEN_PORT))))
> + return aq_ptp_xmit(aq_nic, skb);
> +
> + /* PTP over IPv6 does not use extension headers */
> + if (unlikely(skb->protocol == htons(ETH_P_IPV6) &&
> + ipv6_hdr(skb)->nexthdr == IPPROTO_UDP &&
> + (udp_hdr(skb)->dest == htons(PTP_EV_PORT) ||
> + udp_hdr(skb)->dest == htons(PTP_GEN_PORT))))
> + return aq_ptp_xmit(aq_nic, skb);
> +
> + if (unlikely(eth_hdr(skb)->h_proto == htons(ETH_P_1588)))
> return aq_ptp_xmit(aq_nic, skb);
> }
> #endif
>
> + skb_tx_timestamp(skb);
> return aq_nic_xmit(aq_nic, skb);
> }
>
>The commit message states:
>
> Move skb_tx_timestamp() to non-PTP path to avoid double timestamps
>
> Was there actually a pre-existing double-timestamp? The prior code had
> exactly one skb_tx_timestamp() per packet (aq_nic_xmit() for non-PTP,
> aq_ptp_xmit() for PTP), so this move appears functionally equivalent for
> the PTP case.
>
> However, the new placement in aq_ndev_start_xmit() is before
> aq_nic_xmit()'s __netif_subqueue_stopped()/map-failure/NETDEV_TX_BUSY
> handling, so software TX timestamps may now be generated for frames that
> end up returning NETDEV_TX_BUSY. Is that intentional? It changes a
> user-visible SO_TIMESTAMPING point for all atlantic chips, not just
> AQC113.

The skb_tx_timestamp() move has been dropped in v3. It remains in
aq_nic_xmit() after all NETDEV_TX_BUSY checks, so software timestamps
are only generated for frames that actually proceed to transmission.
The commit message has been updated to remove that bullet point.

> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> index 3cec853e9fad..63a4987a60de 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> @@ -172,7 +180,11 @@ static int aq_nic_update_link_status(struct aq_nic_s *self)
> aq_nic_update_interrupt_moderation_settings(self);
>
> if (self->aq_ptp) {
> - aq_ptp_clock_init(self);
> + /* PTP does not work in some modes even if physical link is up */
> + bool ptp_link_good = (self->aq_hw->aq_link_status.mbps >= 100 &&
> + self->aq_hw->aq_link_status.full_duplex);
> +
> + aq_ptp_clock_init(self, ptp_link_good ? AQ_PTP_LINK_UP : AQ_PTP_NO_LINK);
> aq_ptp_tm_offset_set(self,
> self->aq_hw->aq_link_status.mbps);
> aq_ptp_link_change(self);
>
> aq_ptp_clock_init() now calls aq_ptp_dpath_enable(), which writes
> hw_filter_l3l4_set/hw_filter_l2_set and mutates aq_ptp->udp_filter[i].cmd
> and aq_ptp->eth_type_filter.
>
> The caller of aq_nic_update_link_status() is the service task and
> aq_linkstate_threaded_isr, neither of which hold rtnl_lock. The
> hwtstamp ioctl path enters aq_ptp_hwtstamp_config_set() (and thence
> aq_ptp_dpath_enable()) under rtnl_lock. Is there a lock covering both
> sides? The same filter structures and registers can be programmed
> concurrently from the two paths, and hwtstamp_config.rx_filter is read
> unsynchronized from the link path while written by the ioctl path.

Ack, Both paths are serialized in v3 via a dedicated ptp_filter_lock mutex
on aq_ptp. Both aq_ptp_clock_init() (link/service task) and
aq_ptp_hwtstamp_config_set() (ioctl path) acquire this mutex before
calling aq_ptp_dpath_enable(), preventing concurrent filter
programming.

> [ ... ]
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
> index 7486a28d7ff8..781d865e1127 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ptp.c
>
> [ ... ]
>
> +static int aq_ptp_dpath_enable(struct aq_ptp_s *aq_ptp,
> + int enable_flags, u16 rx_queue)
> +{
> [ ... ]
> + if (enable_flags & AQ_HW_PTP_L2_ENABLE) {
> + aq_ptp->eth_type_filter.ethertype = ETH_P_1588;
> + aq_ptp->eth_type_filter.queue = rx_queue;
> + }
> +
> + if (hw_ops->hw_filter_l3l4_set) {
> + for (i = 0; i < flt_idx; i++) {
> + err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
> + &aq_ptp->udp_filter[i]);
> [ ... ]
> + }
> +
> + if (!err && hw_ops->hw_filter_l2_set) {
> + err = hw_ops->hw_filter_l2_set(aq_nic->aq_hw,
> + &aq_ptp->eth_type_filter);
>
> When enable_flags is AQ_HW_PTP_L4_ENABLE only (for example
> HWTSTAMP_FILTER_PTP_V1_L4_* or HWTSTAMP_FILTER_PTP_V2_L4_*), the block
> that sets eth_type_filter.ethertype/queue is skipped, but
> hw_filter_l2_set is still called unconditionally. Will that install a
> stale L2 filter (zero ethertype from kzalloc, or leftover ETH_P_1588
> from a previous call) into hardware?

Fixed in v3. The hw_filter_l2_set call is now gated by
(enable_flags & AQ_HW_PTP_L2_ENABLE), so it is only invoked when the
L2 filter is actually requested. A pure AQ_HW_PTP_L4_ENABLE path
skips both the eth_type_filter setup and the hw_filter_l2_set call
entirely — no stale or zero ethertype filter is installed.

>
> The pre-patch aq_ptp_prepare_filters() populated both filter
> descriptions together, so this is a new behavior.
>
> + } else {
> + /* PTP disabled, clear all UDP/L2 filters */
> + for (i = 0; i < PTP_UDP_FILTERS_CNT; i++) {
> + aq_ptp->udp_filter[i].cmd &=
> + ~HW_ATL_RX_ENABLE_FLTR_L3L4;
> + if (hw_ops->hw_filter_l3l4_set) {
> + err = hw_ops->hw_filter_l3l4_set(aq_nic->aq_hw,
> + &aq_ptp->udp_filter[i]);
>
> On the a1_ptp path, only udp_filter[0] is ever populated with real cmd
> content; slots 1..3 have only their location field set. This disable
> branch now pushes all four slots through hw_filter_l3l4_set on every
> hwtstamp-disable. Is programming the extra three zero-cmd filter
> entries intended for AQC107?

On a1_ptp, slots 1–3 have cmd = 0 with only their location set;
writing them with HW_ATL_RX_ENABLE_FLTR_L3L4 cleared is
harmless — it simply confirms already-disabled filter slots.

> [ ... ]
>
> @@ -1035,46 +1246,49 @@ static struct ptp_clock_info aq_ptp_clock = {
[ ... ]
> +int aq_ptp_init(struct aq_nic_s *aq_nic, unsigned int idx_ptp_vec)
> {
> bool a1_ptp = ATL_HW_IS_CHIP_FEATURE(aq_nic->aq_hw, ATLANTIC);
> + bool a2_ptp = ATL_HW_IS_CHIP_FEATURE(aq_nic->aq_hw, ANTIGUA);
[ ... ]
> - hw_atl_utils_mpi_read_stats(aq_nic->aq_hw, &mbox);
> -
> - if (!(mbox.info.caps_ex & BIT(CAPS_EX_PHY_PTP_EN))) {
> + if (a1_ptp) {
> + hw_atl_utils_mpi_read_stats(aq_nic->aq_hw, &mbox);
> + if (!(mbox.info.caps_ex & BIT(CAPS_EX_PHY_PTP_EN))) {
> + aq_nic->aq_ptp = NULL;
> + return 0;
> + }
> + } else {
> + memset(&mbox, 0, sizeof(mbox));
> +
> + if (a2_ptp) {
> + mbox.info.ptp_offset.ingress_100 = HW_ATL2_PTP_OFFSET_INGRESS_100;
> [ ... ]
> + }
> + }
>
> The A1 path gates PTP on caps_ex & BIT(CAPS_EX_PHY_PTP_EN), but the A2
> path enables PTP unconditionally on any ATL_HW_IS_CHIP_FEATURE(...,
> ANTIGUA) chip. The commit message only mentions AQC113, but AQC115c
> and AQC116c also match ANTIGUA. Is there a capability bit that should
> gate this for the non-AQC113 Antigua variants?
>

The existing caps structures for these variants don't define
any PTP-specific flag. Adding per-variant PTP capability gating
is outside the scope of this series. If AQC115C/116C don't support
PTP hardware, that can be addressed in a follow-up by adding a
ptp_supported cap flag.

> [ ... ]
>
> /* enable ptp counter */
> + aq_ptp->ptp_clock_sel = ATL_TSG_CLOCK_SEL_0;
> aq_utils_obj_set(&aq_nic->aq_hw->flags, AQ_HW_PTP_AVAILABLE);
> - mutex_lock(&aq_nic->fwreq_mutex);
> - aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 1);
> - aq_ptp_clock_init(aq_nic);
> - mutex_unlock(&aq_nic->fwreq_mutex);
> + if (a1_ptp) {
> + mutex_lock(&aq_nic->fwreq_mutex);
> + aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 1);
> + mutex_unlock(&aq_nic->fwreq_mutex);
> + }
> + if (a2_ptp)
> + aq_nic->aq_hw_ops->enable_ptp(aq_nic->aq_hw, aq_ptp->ptp_clock_sel, 1);
>
> The A1 enable_ptp call is wrapped in fwreq_mutex, but the A2
> enable_ptp (which performs hw_atl2_tsg_clock_reset / _en /
> hw_atl2_adj_clock_freq / hw_atl2_tpb_tps_highest_priority_tc_enable_set)
> is not. aq_ptp_hw_pin_conf() holds fwreq_mutex when programming the
> same TSG/GPIO block via hw_gpio_pulse(). hw_atl2_hw_reset() also calls
> hw_atl2_enable_ptp() without fwreq_mutex. Should the A2 enable_ptp
> paths also take fwreq_mutex to avoid concurrent writes to the TSG
> block?

fwreq_mutex guards firmware mailbox requests (aq_fw_ops calls), not
MMIO writes. The A2 enable_ptp does direct TSG register writes and
does not need the mutex — not a concern.

> [ ... ]
>
> @@ -1244,30 +1514,45 @@ void aq_ptp_unregister(struct aq_nic_s *aq_nic)
> [ ... ]
> void aq_ptp_free(struct aq_nic_s *aq_nic)
> {
> struct aq_ptp_s *aq_ptp = aq_nic->aq_ptp;
> + int i;
>
> if (!aq_ptp)
> return;
>
> + /* disable ptp */
> + if (aq_ptp->a1_ptp) {
> + mutex_lock(&aq_nic->fwreq_mutex);
> + aq_nic->aq_fw_ops->enable_ptp(aq_nic->aq_hw, 0);
> + mutex_unlock(&aq_nic->fwreq_mutex);
> + }
> +
> + if (aq_ptp->a2_ptp)
> + aq_nic->aq_hw_ops->enable_ptp(aq_nic->aq_hw,
> + aq_ptp->ptp_clock_sel, 0);
> +
> + cancel_delayed_work_sync(&aq_ptp->poll_sync);
> +
> aq_nic_release_filter(aq_nic, aq_rx_filter_ethertype,
> aq_ptp->eth_type_filter.location);
> [ ... ]
>
> The pre-patch code cancelled poll_sync first, then disabled PTP. The
> new order disables PTP before cancel_delayed_work_sync(&aq_ptp->poll_sync).
>
> A poll_sync worker running concurrently will call
>aq_ptp_check_sync1588() / hw_ts_to_sys_clock() against TSG clocks that
> enable_ptp(..., 0) has just torn down. For A2 the same worker may race
> with hw_atl2_tsg_clock_en register writes.
>
> Is there a reason to invert this ordering? The standard pattern is
> cancel_delayed_work_sync() first, then tear down the resources the
> worker uses.
>

Fixed in v3: cancel_delayed_work_sync(&aq_ptp->poll_sync) is called
before enable_ptp(..., 0), ensuring the worker is fully stopped
before TSG resources are torn down.

> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> index e270327e47fd..a52d6d3fe464 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> @@ -308,24 +308,30 @@ bool aq_ring_tx_clean(struct aq_ring_s *self)
> }
> }
>
> - if (likely(!buff->is_eop))
> - goto out;
> [ ... ]
> + if (unlikely(buff->is_eop)) {
> + if (unlikely(buff->request_ts) &&
> + self->aq_nic->aq_hw_ops->hw_ring_tx_ptp_get_ts) {
> + u64 ts = self->aq_nic->aq_hw_ops->hw_ring_tx_ptp_get_ts(self);
> +
> + if (!ts)
> + break;
>
> -out:
> + aq_ptp_tx_hwtstamp(self->aq_nic, ts);
> + }
> + if (buff->skb) {
> [ ... ]
> + }
> + }
> buff->skb = NULL;
> buff->xdpf = NULL;
> buff->pa = 0U;
>
> When hw_ring_tx_ptp_get_ts() returns 0 (TX timestamp not yet ready),
> the loop does break. At that point dma_unmap_single()/dma_unmap_page()
> has already run earlier in the iteration, but buff->is_mapped remains
> 1, buff->pa still holds the original DMA address, buff->skb is still
> set, and sw_head is not advanced.
>
> On the next aq_ring_tx_clean() call the same descriptor is revisited
> and dma_unmap is called a second time on the already-unmapped address.
> Under CONFIG_DMA_API_DEBUG this triggers a WARN; under an IOMMU it may
> corrupt mappings.
>
> The commit message also says:
>
> Poll for TX timestamp completion in aq_ring_tx_clean() with a
> timeout mechanism (aq_ptp_tx_ts_timedout/clear)
>
> but no aq_ptp_tx_ts_timedout or aq_ptp_tx_ts_clear helper appears in
> this patch. Where is the timeout that bounds this polling loop?
> If the HW never writes back DD|TS_VALID (silicon quirk, link flap, lost
> packet), does the TX ring stall indefinitely and leak the skb and DMA
> mapping?

Both points are addressed in v3:

On the double-unmap: when hw_ring_tx_ptp_get_ts() returns 0 and the
timeout has not expired, the code now explicitly clears buff->is_mapped
and buff->pa before breaking, so the next iteration sees
is_mapped == 0 and skips dma_unmap. No double-unmap occurs.

On the timeout: a ptp_ts_deadline field is checked via
time_after(jiffies, self->ptp_ts_deadline). On expiry, request_ts
is cleared and the skb is freed via dev_kfree_skb_any(), unblocking
the ring. The commit message reference to aq_ptp_tx_ts_timedout/clear
was from an earlier draft and will be corrected in v3.

> diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
> index c71e8d1adfc9..3047bda619c0 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl2/hw_atl2.c
> @@ -719,14 +735,24 @@ static int hw_atl2_hw_ring_rx_init(struct aq_hw_s *self,
> struct aq_ring_s *aq_ring,
> struct aq_ring_param_s *aq_ring_param)
> {
> - return hw_atl_b0_hw_ring_rx_init(self, aq_ring, aq_ring_param);
> + int res = hw_atl_b0_hw_ring_rx_init(self, aq_ring, aq_ring_param);
> +
> + if (aq_ptp_ring(aq_ring->aq_nic, aq_ring))
> + hw_atl2_hw_rx_ptp_ring_init(self, aq_ring);
> +
> + return res;
> }
>
> static int hw_atl2_hw_ring_tx_init(struct aq_hw_s *self,
> struct aq_ring_s *aq_ring,
> struct aq_ring_param_s *aq_ring_param)
> {
> - return hw_atl_b0_hw_ring_tx_init(self, aq_ring, aq_ring_param);
> + int res = hw_atl_b0_hw_ring_tx_init(self, aq_ring, aq_ring_param);
> +
> + if (aq_ptp_ring(aq_ring->aq_nic, aq_ring))
> + hw_atl2_hw_tx_ptp_ring_init(self, aq_ring);
> +
> + return res;
> }
>
> hw_atl2_hw_{rx,tx}_ptp_ring_init() return aq_hw_err_from_flags(self),
> which reports -ENXIO/-EIO when AQ_HW_FLAG_ERR_UNPLUG or
> AQ_HW_FLAG_ERR_HW is set. The return value is discarded here, so PTP
> ring setup errors are hidden from the caller. Should this be something
> like:
>
> return res ? res : aq_hw_err_from_flags(self);

Fixed in v3. The PTP ring init return values are captured in rx_err/
tx_err and propagated to the caller — PTP ring setup errors are no
longer silently discarded. The base init (hw_atl_b0_hw_ring_*_init)
is also only attempted when it succeeds (if (!res && ...)).

>
> [ ... ]
>
> +static u64 hw_atl2_hw_ring_tx_ptp_get_ts(struct aq_ring_s *ring)
> +{
> + struct hw_atl2_txts_s *txts;
> +
> + txts = (struct hw_atl2_txts_s *)&ring->dx_ring[ring->sw_head *
> + HW_ATL2_TXD_SIZE];
> + /* DD + TS_VALID */
> + if ((txts->ctrl & HW_ATL2_TXTS_DD) && (txts->ctrl & HW_ATL2_TXTS_TS_VALID))
> + return txts->ts;
> +
> + return 0;
> +}
>
> txts lives in DMA-coherent memory written by the device. The compiler
> can issue two separate loads of txts->ctrl between the two bitmask
> checks, producing a torn state (DD set, TS_VALID appears unset, or vice
> versa). On weakly-ordered architectures (arm64, powerpc) the CPU can
> also speculate txts->ts ahead of the ctrl check.
>
> Should this use a READ_ONCE on ctrl followed by dma_rmb() before
> reading txts->ts?

Fixed in v3: ctrl is read once via READ_ONCE(txts->ctrl) and both
bit checks are made against the single cached value. A dma_rmb() is
inserted before reading txts->ts to prevent the CPU from speculating
the timestamp load ahead of the control field check on weakly-ordered
architectures.

> +static u16 hw_atl2_hw_rx_extract_ts(struct aq_hw_s *self, u8 *p,
> + unsigned int len, u64 *timestamp)
> +{
> + unsigned int offset = HW_ATL2_RX_TS_SIZE;
> + u8 *ptr;
> +
> + if (len <= offset || !timestamp)
> + return 0;
> +
> + ptr = p + (len - offset);
> + memcpy(timestamp, ptr, sizeof(*timestamp));
> +
> + return HW_ATL2_RX_TS_SIZE;
> +}
>
> The data at ptr is written by the device in a defined wire byte order,
> but is being copied into a plain host-order u64 with no byteswap.
> The A1 helper hw_atl_b0_rx_extract_ts() handles this with explicit
> __be64/__be32 locals and be64_to_cpu/be32_to_cpu accessors. Is the AQC113
> field really host-endian, or should big-endian hosts byteswap here as
> well? Also, struct hw_atl2_txts_s declares ts as u64 rather than a
> __bitwise type, which will not be caught by sparse.

Both points are addressed in v3. hw_atl2_hw_rx_extract_ts() now copies
into a __le64 local and converts with le64_to_cpu() before storing
into *timestamp. struct hw_atl2_txts_s declares ts as __le64,
so sparse will flag incorrect accesses, and hw_ring_tx_ptp_get_ts()
reads it with le64_to_cpu(txts->ts).

> [ ... ]
>
> aq_ptp_ring_alloc() error path in aq_ptp.c:
>
> + if (aq_ptp->a1_ptp) {
> + err = aq_ring_hwts_rx_alloc(&aq_ptp->hwts_rx, aq_nic, PTP_HWST_RING_IDX,
> + aq_nic->aq_nic_cfg.rxds,
> + aq_nic->aq_nic_cfg.aq_hw_caps->rxd_size);
> + if (err)
> + goto err_exit_ptp_rx;
> + }
> [ ... ]
> err_exit_hwts_rx:
> - aq_ring_hwts_rx_free(&aq_ptp->hwts_rx);
> + if (aq_ptp->a1_ptp)
> + aq_ring_free(&aq_ptp->hwts_rx);
>
> aq_ring_hwts_rx_alloc() allocates sz = size * dx_size + AQ_CFG_RXDS_DEF
> bytes of DMA-coherent memory, but the err_exit_hwts_rx label now calls
> aq_ring_free() which frees size * dx_size (the AQ_CFG_RXDS_DEF padding
> is dropped). The pre-patch code used aq_ring_hwts_rx_free() which
> matched the allocation size. Should this stay as aq_ring_hwts_rx_free()
> in the error path so dma_free_coherent() sees the original allocation
> size?

Fixed in v3: the err_exit_hwts_rx label uses aq_ring_hwts_rx_free()
which passes size * dx_size + AQ_CFG_RXDS_DEF to
dma_free_coherent(), matching the original allocation size exactly.

> --
> This is an AI-generated review.