On Wed, 2024-05-29 at 17:50 +0000, Sunil Kovvuri Goutham wrote:
Yes, the transmit timeout is caused by the Tx queue not waking up.
External email : Please do not click links or open attachments until
you have verified the sender or the content.
On Mon, 2024-05-27 at 17:13 +0100, Daniel Golle wrote:
davem@xxxxxxxxxxxxx>On Mon, May 27, 2024 at 03:55:55PM GMT, Sunil Kovvuri Gouthamwrote:
-----Original Message-----
From: Frank Wunderlich <linux@xxxxxxxxx>
Sent: Monday, May 27, 2024 7:52 PM
To: Felix Fietkau <nbd@xxxxxxxx>; Sean Wang <
sean.wang@xxxxxxxxxxxx>;
Mark Lee <Mark-MC.Lee@xxxxxxxxxxxx>; Lorenzo Bianconi
<lorenzo@xxxxxxxxxx>; David S. Miller <
MTK_QDMA_PAGE_SIZE;; Eric
PaoloDumazet
<edumazet@xxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>;
matthias.bgg@xxxxxxxxx>;Abeni
<pabeni@xxxxxxxxxx>; Matthias Brugger <
CrispinAngeloGioacchino Del Regno <
angelogioacchino.delregno@xxxxxxxxxxxxx>
Cc: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>; John
buffer<john@xxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx;
linux-kernel@xxxxxxxxxxxxxxx;
linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
linux-mediatek@xxxxxxxxxxxxxxxxxxx;
Daniel Golle <daniel@xxxxxxxxxxxxxx>
Subject: [net v2] net: ethernet: mtk_eth_soc: handle dma
rarly butsize soc specific
From: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>
The mainline MTK ethernet driver suffers long time from
sizesannoying tx
queue timeouts. We think that this is caused by fixed dma
manner.hardcoded for
all SoCs.
Use the dma-size implementation from SDK in a per SoC
MT7623
Fixes: 656e705243fd ("net-next: mediatek: add support for
ethernet")
Suggested-by: Daniel Golle <daniel@xxxxxxxxxxxxxx>
Signed-off-by: Frank Wunderlich <frank-w@xxxxxxxxxxxxxxx>
..............
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index cae46290a7ae..f1ff1be73926 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
mtk_eth
.............
@@ -1142,40 +1142,46 @@ static int mtk_init_fq_dma(struct
*eth)
cnt * soc-
tx.desc_size,ð-
phy_scratch_ring,GFP_KERNEL);
..............
- for (i = 0; i < cnt; i++) {
- dma_addr_t addr = dma_addr + i *
- struct mtk_tx_dma_v2 *txd;
+ dma_addr = dma_map_single(eth->dma_dev,
+ eth->scratch_head[j], len *
MTK_QDMA_PAGE_SIZE,
+ DMA_FROM_DEVICE);
As per commit msg, the fix is for transmit queue timeouts.
But the DMA buffer changes seems for receive pkts.
Can you please elaborate the connection here.
descriptors
*I guess* the memory window used for both, TX and RX DMA
needs to be wisely split to not risk TX queue overruns, dependingon
the
SoC speed and without hurting RX performance...
Maybe someone inside MediaTek (I've added to Cc now) and more
familiar
with the design can elaborate in more detail.
We've encountered a transmit queue timeout issue on the MT79888 andthe
have identified it as being related to the RSS feature.
We suspect this problem arises from a low level of free TX DMADs,
TX Ring alomost full.2048
Since RSS is enabled, there are 4 Rx Rings, with each containing
DMADs, totaling 8192 for Rx. In contrast, the Tx Ring has only 2048balance
DMADs. Tx DMADs will be consumed rapidly during a 10G LAN to 10G WAN
forwarding test, subsequently causing the transmit queue to stop.
Therefore, we reduced the number of Rx DMADs for each ring to
both Tx and Rx DMADs, which resolves this issue.
Okay, but it’s still not clear why it’s resulting in a transmit
timeout.
When transmit queue is stopped and after some Tx pkts in the pipeline
are flushed out, isn’t
Tx queue wakeup not happening ?
The Tx queue stops when the free counter is less than ring->thres, and
it will wake up once the free counter is greater than ring->thres.
If the CPU is too late to wake up the Tx queues, it may cause a
transmit timeout.
Therefore, we balanced the TX and RX DMADs to improve this error
situation.