Re: [PATCH v2] net: stmmac: allocate separate page for buffer

From: Simon Horman
Date: Thu Sep 12 2024 - 04:47:27 EST


On Tue, Sep 10, 2024 at 06:18:41PM +0530, Suraj Jaiswal wrote:
> Currently for TSO page is mapped with dma_map_single()
> and then resulting dma address is referenced (and offset)
> by multiple descriptors until the whole region is
> programmed into the descriptors.
> This makes it possible for stmmac_tx_clean() to dma_unmap()
> the first of the already processed descriptors, while the
> rest are still being processed by the DMA engine. This leads
> to an iommu fault due to the DMA engine using unmapped memory
> as seen below:
>
> arm-smmu 15000000.iommu: Unhandled context fault: fsr=0x402,
> iova=0xfc401000, fsynr=0x60003, cbfrsynra=0x121, cb=38
>
> Descriptor content:
> TDES0 TDES1 TDES2 TDES3
> 317: 0xfc400800 0x0 0x36 0xa02c0b68
> 318: 0xfc400836 0x0 0xb68 0x90000000
>
> As we can see above descriptor 317 holding a page address
> and 318 holding the buffer address by adding offset to page
> addess. Now if 317 descritor is cleaned as part of tx_clean()

Hi Suraj,

As it looks like there will be a v3 anyway, some minor nits from my side.

addess -> address

Flagged by checkpatch.pl --codespell

> then we will get SMMU fault if 318 descriptor is getting accessed.
>
> To fix this, let's map each descriptor's memory reference individually.
> This way there's no risk of unmapping a region that's still being
> referenced by the DMA engine in a later descriptor.
>
> Signed-off-by: Suraj Jaiswal <quic_jsuraj@xxxxxxxxxxx>
> ---
>
> Changes since v2:
> - Update commit text with more details.
> - fixed Reverse xmas tree order issue.
>
>
> Changes since v1:
> - Fixed function description
> - Fixed handling of return value.
>
>
> .../net/ethernet/stmicro/stmmac/stmmac_main.c | 63 ++++++++++++-------
> 1 file changed, 42 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 83b654b7a9fd..98d5a4b64cac 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -4136,21 +4136,25 @@ static bool stmmac_vlan_insert(struct stmmac_priv *priv, struct sk_buff *skb,
> /**
> * stmmac_tso_allocator - close entry point of the driver
> * @priv: driver private structure
> - * @des: buffer start address
> + * @addr: Contains either skb frag address or skb->data address
> * @total_len: total length to fill in descriptors
> * @last_segment: condition for the last descriptor
> * @queue: TX queue index
> + * @is_skb_frag: condition to check whether skb data is part of fragment or not
> * Description:
> * This function fills descriptor and request new descriptors according to
> * buffer length to fill
> + * This function returns 0 on success else -ERRNO on fail

Please consider using a "Return:" or "Returns:" section to document
return values.

Flagged by ./scripts/kernel-doc -none -Wall .../stmmac_main.c
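
For example, something like this (wording is just a suggestion, based
on the error paths in the patch):

 * Return: 0 on success, a negative error code (e.g. -ENOMEM on a DMA
 * mapping failure) otherwise.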

> */
> -static void stmmac_tso_allocator(struct stmmac_priv *priv, dma_addr_t des,
> - int total_len, bool last_segment, u32 queue)
> +static int stmmac_tso_allocator(struct stmmac_priv *priv, void *addr,
> + int total_len, bool last_segment, u32 queue, bool is_skb_frag)

The line above could be trivially wrapped to <= 80 columns wide, as is
still preferred for networking code. Likewise a little further below.

Likewise elsewhere in this patch.

You can pass --max-line-length=80 to checkpatch.pl to check for this.
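
For example, the new signature could be wrapped as (untested):

static int stmmac_tso_allocator(struct stmmac_priv *priv, void *addr,
				int total_len, bool last_segment,
				u32 queue, bool is_skb_frag)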

> {
> struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
> struct dma_desc *desc;
> u32 buff_size;
> int tmp_len;
> + unsigned char *data = addr;
> + unsigned int offset = 0;

Please consider arranging local variables in Networking code in
reverse xmas tree order - longest line to shortest.

Edward Cree's xmastree tool can be of assistance here:
https://github.com/ecree-solarflare/xmastree
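
With the locals in this function, that would look like (untested):

	struct stmmac_tx_queue *tx_q = &priv->dma_conf.tx_queue[queue];
	unsigned char *data = addr;
	unsigned int offset = 0;
	struct dma_desc *desc;
	u32 buff_size;
	int tmp_len;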

>
> tmp_len = total_len;
>
> @@ -4161,20 +4165,44 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, dma_addr_t des,
> priv->dma_conf.dma_tx_size);
> WARN_ON(tx_q->tx_skbuff[tx_q->cur_tx]);
>
> + buff_size = tmp_len >= TSO_MAX_BUFF_SIZE ? TSO_MAX_BUFF_SIZE : tmp_len;

FWIW, I think that min() would allow the intent
of the line above to be expressed more succinctly.
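
Something like (untested; min_t() may be needed if the types differ):

	buff_size = min(tmp_len, TSO_MAX_BUFF_SIZE);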

> +
> if (tx_q->tbs & STMMAC_TBS_AVAIL)
> desc = &tx_q->dma_entx[tx_q->cur_tx].basic;
> else
> desc = &tx_q->dma_tx[tx_q->cur_tx];
>
> - curr_addr = des + (total_len - tmp_len);
> + offset = total_len - tmp_len;
> + if (!is_skb_frag) {
> + curr_addr = dma_map_single(priv->device, data + offset, buff_size,
> + DMA_TO_DEVICE);
> +
> + if (dma_mapping_error(priv->device, curr_addr))
> + return -ENOMEM;
> +
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].buf = curr_addr;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].len = buff_size;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].map_as_page = false;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].buf_type = STMMAC_TXBUF_T_SKB;
> + } else {
> + curr_addr = skb_frag_dma_map(priv->device, addr, offset,
> + buff_size,
> + DMA_TO_DEVICE);
> +
> + if (dma_mapping_error(priv->device, curr_addr))
> + return -ENOMEM;
> +
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].buf = curr_addr;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].len = buff_size;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].map_as_page = true;
> + tx_q->tx_skbuff_dma[tx_q->cur_tx].buf_type = STMMAC_TXBUF_T_SKB;
> + }

Maybe my eyes are deceiving me, but there seems to be quite a lot of
repetition in the two arms of the if/else condition above. If so, can it be
consolidated by moving everything other than the assignment of curr_addr
out of the conditional blocks? (And dropping the {}.)
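
Something like this (completely untested, with map_as_page taking
is_skb_frag directly):

	offset = total_len - tmp_len;
	if (!is_skb_frag)
		curr_addr = dma_map_single(priv->device, data + offset,
					   buff_size, DMA_TO_DEVICE);
	else
		curr_addr = skb_frag_dma_map(priv->device, addr, offset,
					     buff_size, DMA_TO_DEVICE);

	if (dma_mapping_error(priv->device, curr_addr))
		return -ENOMEM;

	tx_q->tx_skbuff_dma[tx_q->cur_tx].buf = curr_addr;
	tx_q->tx_skbuff_dma[tx_q->cur_tx].len = buff_size;
	tx_q->tx_skbuff_dma[tx_q->cur_tx].map_as_page = is_skb_frag;
	tx_q->tx_skbuff_dma[tx_q->cur_tx].buf_type = STMMAC_TXBUF_T_SKB;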

> +
> if (priv->dma_cap.addr64 <= 32)
> desc->des0 = cpu_to_le32(curr_addr);
> else
> stmmac_set_desc_addr(priv, desc, curr_addr);
>
> - buff_size = tmp_len >= TSO_MAX_BUFF_SIZE ?
> - TSO_MAX_BUFF_SIZE : tmp_len;
> -
> stmmac_prepare_tso_tx_desc(priv, desc, 0, buff_size,
> 0, 1,
> (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),

...