Re: [net v2] net: ethernet: mtk_eth_soc: handle dma buffer size soc specific

From: Florian Fainelli
Date: Thu May 30 2024 - 00:22:12 EST




On 5/29/2024 6:46 PM, Bc-bocun Chen (陳柏村) wrote:
On Wed, 2024-05-29 at 17:50 +0000, Sunil Kovvuri Goutham wrote:

External email : Please do not click links or open attachments until
you have verified the sender or the content.
On Mon, 2024-05-27 at 17:13 +0100, Daniel Golle wrote:
On Mon, May 27, 2024 at 03:55:55PM GMT, Sunil Kovvuri Goutham
wrote:


-----Original Message-----
From: Frank Wunderlich <linux@xxxxxxxxx>
Sent: Monday, May 27, 2024 7:52 PM
To: Felix Fietkau <nbd@xxxxxxxx>; Sean Wang <
sean.wang@xxxxxxxxxxxx>;
Mark Lee <Mark-MC.Lee@xxxxxxxxxxxx>; Lorenzo Bianconi
<lorenzo@xxxxxxxxxx>; David S. Miller <
davem@xxxxxxxxxxxxx>
; Eric
Dumazet
<edumazet@xxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>;
Paolo
Abeni
<pabeni@xxxxxxxxxx>; Matthias Brugger <
matthias.bgg@xxxxxxxxx>;
AngeloGioacchino Del Regno <
angelogioacchino.delregno@xxxxxxxxxxxxx>
Cc: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>; John
Crispin
<john@xxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx;
linux-kernel@xxxxxxxxxxxxxxx;
linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
linux-mediatek@xxxxxxxxxxxxxxxxxxx;
Daniel Golle <daniel@xxxxxxxxxxxxxx>
Subject: [net v2] net: ethernet: mtk_eth_soc: handle dma
buffer
size soc specific

From: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>

The mainline MTK ethernet driver suffers long time from
rarly but
annoying tx
queue timeouts. We think that this is caused by fixed dma
sizes
hardcoded for
all SoCs.

Use the dma-size implementation from SDK in a per SoC
manner.

Fixes: 656e705243fd ("net-next: mediatek: add support for
MT7623
ethernet")
Suggested-by: Daniel Golle <daniel@xxxxxxxxxxxxxx>
Signed-off-by: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>


..............

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index cae46290a7ae..f1ff1be73926 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c


.............
@@ -1142,40 +1142,46 @@ static int mtk_init_fq_dma(struct
mtk_eth
*eth)
cnt * soc-
tx.desc_size,
&eth-
phy_scratch_ring,
GFP_KERNEL);


..............
- for (i = 0; i < cnt; i++) {
- dma_addr_t addr = dma_addr + i *
MTK_QDMA_PAGE_SIZE;
- struct mtk_tx_dma_v2 *txd;
+ dma_addr = dma_map_single(eth->dma_dev,
+ eth->scratch_head[j], len *
MTK_QDMA_PAGE_SIZE,
+ DMA_FROM_DEVICE);



As per commit msg, the fix is for transmit queue timeouts.
But the DMA buffer changes seems for receive pkts.
Can you please elaborate the connection here.


*I guess* the memory window used for both, TX and RX DMA
descriptors
needs to be wisely split to not risk TX queue overruns, depending
on
the
SoC speed and without hurting RX performance...

Maybe someone inside MediaTek (I've added to Cc now) and more
familiar
with the design can elaborate in more detail.
We've encountered a transmit queue timeout issue on the MT79888 and
have identified it as being related to the RSS feature.
We suspect this problem arises from a low level of free TX DMADs,
the
TX Ring alomost full.
Since RSS is enabled, there are 4 Rx Rings, with each containing
2048
DMADs, totaling 8192 for Rx. In contrast, the Tx Ring has only 2048
DMADs. Tx DMADs will be consumed rapidly during a 10G LAN to 10G WAN
forwarding test, subsequently causing the transmit queue to stop.
Therefore, we reduced the number of Rx DMADs for each ring to
balance
both Tx and Rx DMADs, which resolves this issue.

Okay, but it’s still not clear why it’s resulting in a transmit
timeout.
When transmit queue is stopped and after some Tx pkts in the pipeline
are flushed out, isn’t
Tx queue wakeup not happening ?
Yes, the transmit timeout is caused by the Tx queue not waking up.
The Tx queue stops when the free counter is less than ring->thres, and
it will wake up once the free counter is greater than ring->thres.
If the CPU is too late to wake up the Tx queues, it may cause a
transmit timeout.
Therefore, we balanced the TX and RX DMADs to improve this error
situation.

All of this needs to go into the commit message please, thanks!
--
Florian