Re: [PATCH net-next v3 3/4] net: lan966x: Add FDMA functionality
From: Jakub Kicinski
Date: Wed Apr 06 2022 - 15:41:46 EST
On Wed, 6 Apr 2022 13:21:15 +0200 Horatiu Vultur wrote:
> > > +static int lan966x_fdma_tx_alloc(struct lan966x_tx *tx)
> > > +{
> > > + struct lan966x *lan966x = tx->lan966x;
> > > + struct lan966x_tx_dcb *dcb;
> > > + struct lan966x_db *db;
> > > + int size;
> > > + int i, j;
> > > +
> > > + tx->dcbs_buf = kcalloc(FDMA_DCB_MAX, sizeof(struct lan966x_tx_dcb_buf),
> > > + GFP_ATOMIC);
> > > + if (!tx->dcbs_buf)
> > > + return -ENOMEM;
> > > +
> > > + /* calculate how many pages are needed to allocate the dcbs */
> > > + size = sizeof(struct lan966x_tx_dcb) * FDMA_DCB_MAX;
> > > + size = ALIGN(size, PAGE_SIZE);
> > > + tx->dcbs = dma_alloc_coherent(lan966x->dev, size, &tx->dma, GFP_ATOMIC);
> >
> > This function seems to be called only from probe, so GFP_KERNEL
> > is better.
>
> But in the next patch of this series it will be called while holding
> the lan966x->tx_lock. Should I still change it to GFP_KERNEL here and
> then change it back to GFP_ATOMIC in the next patch?

Ah, I missed that. You can keep the GFP_ATOMIC then.

But I think the reconfig path may be racy. You disable Rx, but don't
disable napi. NAPI may still be running and doing Rx while you're
trying to free the rx skbs, no?

Once napi is disabled you can disable Tx and then you have full
ownership of the Tx side, no need to hold the lock during
lan966x_fdma_tx_alloc(), I'd think.
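
To illustrate the ordering suggested above, here is a rough userspace
model in plain C. It is only a sketch, not driver code: every name in it
(struct model_dev, model_rx_free(), model_tx_realloc(), model_reconfig())
is hypothetical, and setting a flag merely stands in for what
napi_disable() and stopping the Tx queues would do in the kernel. The
model refuses to free Rx buffers while NAPI is enabled, and refuses to
reallocate the Tx ring until both NAPI and Tx are stopped.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names come from the patch. */
#define MODEL_RX_BUFS 4

struct model_dev {
	bool rx_hw_enabled;	/* hardware Rx channel running */
	bool napi_enabled;	/* poll routine may still run */
	bool tx_enabled;	/* xmit path may still queue frames */
	int rx_bufs;		/* number of Rx buffers currently owned */
	int tx_ring_gen;	/* bumped each time the Tx ring is reallocated */
};

static void model_dev_init(struct model_dev *d)
{
	d->rx_hw_enabled = true;
	d->napi_enabled = true;
	d->tx_enabled = true;
	d->rx_bufs = MODEL_RX_BUFS;
	d->tx_ring_gen = 0;
}

/* Freeing Rx buffers is only safe once the poll routine cannot run. */
static int model_rx_free(struct model_dev *d)
{
	if (d->napi_enabled)
		return -1;	/* would race with a running poll */
	d->rx_bufs = 0;
	return 0;
}

/* With NAPI and Tx both stopped we own the Tx side, so no lock is needed. */
static int model_tx_realloc(struct model_dev *d)
{
	if (d->napi_enabled || d->tx_enabled)
		return -1;
	d->tx_ring_gen++;
	return 0;
}

/* Suggested order: stop Rx, then NAPI, then Tx, and only then free. */
static int model_reconfig(struct model_dev *d)
{
	d->rx_hw_enabled = false;
	d->napi_enabled = false;	/* stands in for napi_disable() */
	d->tx_enabled = false;		/* stands in for stopping the Tx queues */

	if (model_rx_free(d))
		return -1;
	if (model_tx_realloc(d))
		return -1;

	/* Bring everything back up in the reverse order. */
	d->rx_bufs = MODEL_RX_BUFS;
	d->tx_enabled = true;
	d->napi_enabled = true;
	d->rx_hw_enabled = true;
	return 0;
}
```

Calling model_rx_free() after clearing only rx_hw_enabled fails in this
model, which is the race described above; model_reconfig() succeeds
because it stops NAPI first.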
> > > +int lan966x_fdma_xmit(struct sk_buff *skb, __be32 *ifh, struct net_device *dev)
> > > +{
> > > + struct lan966x_port *port = netdev_priv(dev);
> > > + struct lan966x *lan966x = port->lan966x;
> > > + struct lan966x_tx_dcb_buf *next_dcb_buf;
> > > + struct lan966x_tx_dcb *next_dcb, *dcb;
> > > + struct lan966x_tx *tx = &lan966x->tx;
> > > + struct lan966x_db *next_db;
> > > + int needed_headroom;
> > > + int needed_tailroom;
> > > + dma_addr_t dma_addr;
> > > + int next_to_use;
> > > + int err;
> > > +
> > > + /* Get next index */
> > > + next_to_use = lan966x_fdma_get_next_dcb(tx);
> > > + if (next_to_use < 0) {
> > > + netif_stop_queue(dev);
> > > + return NETDEV_TX_BUSY;
> > > + }
> > > +
> > > + if (skb_put_padto(skb, ETH_ZLEN)) {
> > > + dev->stats.tx_dropped++;
> > > + return NETDEV_TX_OK;
> > > + }
> > > +
> > > + /* skb processing */
> > > + needed_headroom = max_t(int, IFH_LEN * sizeof(u32) - skb_headroom(skb), 0);
> > > + needed_tailroom = max_t(int, ETH_FCS_LEN - skb_tailroom(skb), 0);
> > > + if (needed_headroom || needed_tailroom || skb_header_cloned(skb)) {
> > > + err = pskb_expand_head(skb, needed_headroom, needed_tailroom,
> > > + GFP_ATOMIC);
> > > + if (unlikely(err)) {
> > > + dev->stats.tx_dropped++;
> > > + err = NETDEV_TX_OK;
> > > + goto release;
> > > + }
> > > + }
> > > +
> > > + skb_tx_timestamp(skb);
> >
> > This could move down after the dma mapping, so it's closer to when
> > the device gets ownership.
>
> The problem is that if I move this lower, the SKB has already been
> changed because the IFH is added to the frame. So if we do timestamping
> in the PHY, the classify call inside 'skb_clone_tx_timestamp' will
> always return PTP_CLASS_NONE, so the PHY will never get the frame.
> That is the reason why I have moved it back.

Oh, I see, makes sense!
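
For readers following along, the effect described above can be shown
with a small userspace sketch. It is deliberately simplified: the real
classifier handles VLAN tags and PTP over UDP as well, whereas
model_is_ptp_l2() below (a hypothetical name) only checks the ethertype
at the usual offset for 0x88F7, the PTP-over-Ethernet ethertype. The
28-byte header size assumes IFH_LEN is 7 words, and model_push_ifh() is
likewise made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_HLEN		14
#define MODEL_ETH_P_1588	0x88F7	/* PTP over Ethernet ethertype */
#define MODEL_IFH_BYTES		28	/* assumption: IFH_LEN (7) * sizeof(u32) */

/* Simplified classifier: is this a layer-2 PTP frame? */
static int model_is_ptp_l2(const uint8_t *frame, size_t len)
{
	uint16_t ethertype;

	if (len < MODEL_ETH_HLEN)
		return 0;
	/* The ethertype follows the two 6-byte MAC addresses. */
	ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
	return ethertype == MODEL_ETH_P_1588;
}

/* Prepend a zeroed header in front of the frame, as the IFH push would. */
static size_t model_push_ifh(uint8_t *dst, size_t cap,
			     const uint8_t *frame, size_t len)
{
	if (cap < len + MODEL_IFH_BYTES)
		return 0;
	memset(dst, 0, MODEL_IFH_BYTES);
	memcpy(dst + MODEL_IFH_BYTES, frame, len);
	return len + MODEL_IFH_BYTES;
}
```

Once the header has been prepended, bytes 12 and 13 of the buffer belong
to the IFH rather than to the Ethernet header, so the same check no
longer recognises the frame as PTP. That is why the timestamp call has
to run before the IFH is added.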