RE: [PATCH net-next v2 3/4] octeon_ep: implement xmit_more in transmit

From: Shinas Rasheed
Date: Mon Oct 30 2023 - 10:15:15 EST


Hi,

I understand the window is closed, but just replying to a pending comment on the thread.

> -----Original Message-----
> From: David Laight <David.Laight@xxxxxxxxxx>
> ----------------------------------------------------------------------
> From: Shinas Rasheed
> > Sent: 24 October 2023 15:51
> >
> > Add xmit_more handling in tx datapath for octeon_ep pf.
> >
> ...
> > -
> > - /* Ring Doorbell to notify the NIC there is a new packet */
> > - writel(1, iq->doorbell_reg);
> > - iq->stats.instr_posted++;
> > + /* Ring Doorbell to notify the NIC of new packets */
> > + writel(iq->fill_cnt, iq->doorbell_reg);
> > + iq->stats.instr_posted += iq->fill_cnt;
> > + iq->fill_cnt = 0;
> > return NETDEV_TX_OK;
>
> Does that really need the count?
> A 'doorbell' register usually just tells the MAC engine
> to go and look at the transmit ring.
> It then continues to process transmits until it fails
> to find a packet.
> So if the transmit is active you don't need to set the bit.
> (Although that is actually rather hard to detect.)

The way the octeon hardware works is that it expects number of newly updated packets to be written to the doorbell register,
which effectively increments the doorbell count which shall be decremented by hardware as it reads these packets. So in essence,
the doorbell count also indicates outstanding packets to be read by hardware.

> The 'xmit_more' flag is useful if (the equivalent of) writing
> the doorbell register is expensive since it can be delayed
> to a later frame and only done once - adding a slight latency
> to the earlier transmits if the mac engine was idle.
>
> I'm not sure how much (if any) performance gain you actually
> get from avoiding the writel().
> Single PCIe writes are 'posted' and pretty much completely
> asynchronous.

Can you elaborate what you are suggesting here to do? The driver is trying to make use of the 'xmit_more'
hint from the network stack, as any network driver might opt to do. I think avoiding continuous PCIe posts for each packet
shall still be wasteful as the hardware can bulk read from the queue if we give it a batch of packets.

> The other problem I've seen is that netdev_xmit_more() is
> the state of the queue when the transmit was started, not
> the current state.
> If a packet is added while the earlier transmit setup code
> is running (setting up the descriptors etc) the it isn't set.
> So the fast path doesn't get taken.

By the next packet the kernel sends, the xmit_more should be set
as far I understand, right? (as the xmit_more bool is set if skb->next is present, if the transmit path follows dev_hard_start_xmit).