Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback
From: Andrew Lunn
Date: Thu Nov 05 2020 - 08:45:20 EST
> > Where is the TX confirm which uses this stored pointer. I don't see it
> > in this file.
> >
>
> The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9.
Not so obvious. Could it be moved here?
> > It can be expensive to store pointer like this in buffers used for
> > DMA.
>
> Yes, it is. But the hardware does not give us any other indication that
> a packet was actually sent so that we can move ahead with consuming the
> initial skb.
>
> > It has to be flushed out of the cache here as part of the
> > send. Then the TX complete needs to invalidate and then read it back
> > into the cache. Or you use coherent memory which is just slow.
> >
> > It can be cheaper to keep a parallel ring in cacheable memory which
> > never gets flushed.
>
> I'm afraid I don't really understand your suggestion. In this parallel
> ring I would keep the skb pointers of all frames which are in-flight?
> Then, when a packet is received on the Tx confirmation queue I would
> have to loop over the parallel ring and determine somehow which skb was
> this packet initially associated to. Isn't this even more expensive?
I don't know this particular hardware, so I will talk in general
terms. Generally, you have a transmit ring. You add new frames to be
sent to the beginning of the ring, and you take off completed frames
from the end of the ring. This is kept in 'expensive' memory, in that
either it is coherent, or you need to do flushes/invalidates.
It is expected that the hardware keeps to ring order. It does not pick
and choose which frames it sends, it does them in order. That means
completion also happens in ring order. So the driver can keep a simple
linear array the size of the ring, in cacheable memory, with pointers
to the skbuf. And it just needs a counting index to know which one
just completed.
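To make that concrete, here is a minimal sketch of such a shadow array.
It assumes strictly in-order completion. TX_RING_SIZE, struct
tx_shadow_ring, tx_shadow_push() and tx_shadow_pop() are names made up
for illustration, not anything from the dpaa2 driver, and sk_buff is
left opaque so the fragment builds outside the kernel.

```c
#include <stddef.h>

#define TX_RING_SIZE 8		/* hypothetical; a power of two */

struct sk_buff;			/* opaque here; a real driver uses <linux/skbuff.h> */

/* Shadow of the hardware TX ring, kept in ordinary cacheable memory. */
struct tx_shadow_ring {
	struct sk_buff *skb[TX_RING_SIZE];
	unsigned int prod;	/* free-running count of frames queued */
	unsigned int cons;	/* free-running count of frames completed */
};

/* Record an skb at enqueue time. Returns 0, or -1 if the ring is full. */
static int tx_shadow_push(struct tx_shadow_ring *r, struct sk_buff *skb)
{
	if (r->prod - r->cons == TX_RING_SIZE)
		return -1;
	r->skb[r->prod++ % TX_RING_SIZE] = skb;
	return 0;
}

/* On TX confirmation, return the oldest in-flight skb, or NULL if none. */
static struct sk_buff *tx_shadow_pop(struct tx_shadow_ring *r)
{
	if (r->cons == r->prod)
		return NULL;
	return r->skb[r->cons++ % TX_RING_SIZE];
}
```

The xmit path would call tx_shadow_push() next to the hardware
enqueue, and the confirmation path would call tx_shadow_pop() and free
what it gets back. Nothing here is ever touched by DMA, so no flush or
invalidate is needed for it.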
Now, your hardware is more complex. You have one queue feeding
multiple switch ports. Maybe it does not keep to ring order? If you
have one port running at 10M/Half, and another at 10G/Full, does it
leave frames for the 10M/Half port in the ring when its egress queue is
full? That is probably a bad idea, since the 10G/Full port could then
starve for lack of free slots in the ring? So my guess would be, the
frames get dropped. And so ring order is maintained.
If you are paranoid it could get out of sync, keep an array of tuples:
the address of the frame descriptor and the skbuf. If the fd address
does not match what you expect, then do a linear search for the fd
address, and increment a counter to record that something odd has happened.
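A rough sketch of that paranoid variant follows. Again, every name in
it (struct tx_tuple_ring, tx_tuple_push(), tx_tuple_complete(), the
out_of_order counter) is invented for illustration, and it assumes the
confirmation carries the same frame descriptor address that was handed
to the hardware. It does not handle frames that never get confirmed.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE 8		/* hypothetical; a power of two */

struct sk_buff;			/* opaque here */

struct tx_tuple {
	uint64_t fd_addr;	/* frame descriptor address given to hardware */
	struct sk_buff *skb;	/* NULL once the slot has been completed */
};

struct tx_tuple_ring {
	struct tx_tuple e[TX_RING_SIZE];
	unsigned int prod;	/* free-running count of frames queued */
	unsigned int cons;	/* index of the oldest frame still in flight */
	unsigned long out_of_order;	/* times the fast path missed */
};

static int tx_tuple_push(struct tx_tuple_ring *r, uint64_t fd_addr,
			 struct sk_buff *skb)
{
	struct tx_tuple *e;

	if (r->prod - r->cons == TX_RING_SIZE)
		return -1;
	e = &r->e[r->prod++ % TX_RING_SIZE];
	e->fd_addr = fd_addr;
	e->skb = skb;
	return 0;
}

static struct sk_buff *tx_tuple_complete(struct tx_tuple_ring *r,
					 uint64_t fd_addr)
{
	struct tx_tuple *e;
	struct sk_buff *skb;
	unsigned int i;

	/* Fast path: confirmation is for the oldest in-flight frame. */
	if (r->cons != r->prod) {
		e = &r->e[r->cons % TX_RING_SIZE];
		if (e->skb && e->fd_addr == fd_addr) {
			skb = e->skb;
			e->skb = NULL;
			/* Skip this slot and any completed out of order. */
			do {
				r->cons++;
			} while (r->cons != r->prod &&
				 !r->e[r->cons % TX_RING_SIZE].skb);
			return skb;
		}
	}

	/* Slow path: something odd happened, count it and search. */
	r->out_of_order++;
	for (i = r->cons; i != r->prod; i++) {
		e = &r->e[i % TX_RING_SIZE];
		if (e->skb && e->fd_addr == fd_addr) {
			skb = e->skb;
			e->skb = NULL;
			return skb;
		}
	}
	return NULL;	/* unknown fd address */
}
```

If out_of_order stays at zero in practice, the hardware keeps to ring
order and the plain array with a counting index is enough.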
Andrew