Re: [PATCH 05/11] PCI: epf-{mhi/test}: Move DMA initialization to EPC init callback
From: Niklas Cassel
Date: Wed Mar 27 2024 - 07:39:23 EST
+CC Vinod
Hello Vinod,
I didn't know the answer, so I chose the "call a friend option" ;)
I hope that you can help me out :)
If you take a look at drivers/pci/endpoint/functions/pci-epf-test.c
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L448-L471
You can see that the driver always does pci_epc_map_addr(),
then it will either use:
DMA API, e.g. dma_map_single() etc.
or
memcpy_fromio()/memcpy_toio()
based on flag FLAG_USE_DMA.
This flag is set via ioctl, so if we run:
/usr/bin/pcitest -d
the flag will be set, without the -d parameter the flag won't be set.
If you look at how the DMA channel is requested:
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L224-L258
If will try to get a private DMA channel, if that fails,
it will use the "dummy memcpy" DMA channel.
If the FLAG_USE_DMA is set, the transfers itself will use:
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155
either dmaengine_prep_slave_single() or dmaengine_prep_dma_memcpy(),
depending on if we are using "dummy memcpy" or not.
If you take e.g. the DWC PCIe EP controller, it can have an embedded DMA
controller on the PCIe controller, and we will try to detect it when
initializing the PCIe EP controller using dw_pcie_edma_detect():
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-designware-ep.c#L759
For the PCIe EP controller that I am using, which have eDMA built-in,
I noticed that if I do not enable the eDMA driver (# CONFIG_DW_EDMA is not
set), I noticed that I can still run:
/usr/bin/pcitest -d
Which will use the "dummy memcpy" DMA channel.
Yes, the performance is poor, but it still works, so it appears that the
fallback code is working properly.
If I enable the eDMA driver (CONFIG_DW_EDMA=y),
I can run:
/usr/bin/pcitest -d
And the performance is good.
So my question is:
Is the "dummy memcpy" DMA channel always available?
Because if it is, I think we could drop the path in the pci-epf-test.c
driver which uses memcpy_fromio()/memcpy_toio() instead of DMA API.
(Since just having a single path to do I/O in the driver would simplify
the driver IMO.)
I assume that the "dummy memcpy" DMA channel just uses memcpy_fromio() and
memcpy_toio() under the hood, so I assume that using the memcpy_fromio()/
memcpy_toio/() is equivalent to using DMA API + dmaengine_prep_dma_memcpy().
Although it would be nice if we didn't need to have the two separate paths
in pci_epf_test_data_transfer() (dmaengine_prep_slave_single() vs
dmaengine_prep_dma_memcpy()) to support the "dummy memcpy" channel.
But I guess that is not possible...
I hope that you can bring some clarity Vinod.
(Please read my replies to Mani below before you compose your email,
as it does provide more insight to this mess.)
Mani, I tried to reply to you inline below, with my limited understanding
of how dmaengine works.
On Wed, Mar 27, 2024 at 11:48:19AM +0530, Manivannan Sadhasivam wrote:
> > So we still want to test:
> > -DMA API using the eDMA
> > -DMA API using the "dummy" memcpy dma-channel.
> >
>
> IMO, the test driver should just test one form of data transfer. Either CPU
> memcpy (using iATU or something similar) or DMA. But I think the motive behind
> using DMA memcpy is that to support platforms that do not pass DMA slave
> channels in devicetree.
>
> It is applicable to test driver but not to MHI driver since all DMA supported
> MHI platforms will pass the DMA slave channels in devicetree.
I don't understand how device tree is relevant here, e.g. qcom-ep.c
specifies pcie_ep->pci.edma.nr_irqs = 1;
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-qcom-ep.c#L818
which is sufficient for you to be able to probe/detect eDMA successfully,
no need for anything in device tree at all.
>
> > However, it seems like both pci-epf-mhi.c and pci-epf-test.c
> > do either:
> > -Use DMA API
> > or
> > -Use memcpy_fromio()/memcpy_toio() instead of DMA API
> >
> >
> > To me, it seems like we should always be able to use
> > DMA API (using either a eDMA or "dummy" memcpy).
> >
>
> No, there are platforms that don't support DMA at all. Like Qcom SDX55, so we
> still need to do CPU memcpy.
I assume that you mean the the PCIe controller used in SDX55 does not
have the eDMA on the PCIe controller, so dw_pcie_edma_detect() will
fail to detect any eDMA. That is fine no?
I assume that this SoC will still able to use the "dummy" memcpy dma-channel?
>
> > I don't really see the need to have the path that does:
> > memcpy_fromio()/memcpy_toio().
> >
> > I know that for DWC, when using memcpy (and this also
> > memcpy via DMA API), we need to map the address using
> > iATU first.
> >
> > But that could probably be done using another flag,
> > perhaps rename that flag FLAG_USE_DMA to NEEDS_MAP or
> > something.
> > (Such that we can change these drivers to only have a
> > code path that uses DMA API.)
> > (...and making sure that inheriting the DMA mask does
> > not affect the DMA mask for DMA_MEMCPY.)
I was wrong here, pci-epf-test always calls pci_epc_map_addr()
regardless if FLAG_USE_DMA is set or not.
(Even though this should be unnecessary when using the eDMA.)
However, if we look at pci-epf-mhi.c we can see that it does
NOT call pci_epc_map_addr() when using DMA API + dmaengine.
Is it really safe to avoid pci_epc_map_addr() in all EPC controllers?
I assume that it should be safe for all "real" DMA channels.
We can see that it is not safe when using DMA API + "dummy" memcpy
dma-channel. (That is why I was asking if we need a NEEDS_MAP, or
MAP_NOT_NEEDED flag.)
> >
> > But perhaps I am missing something... and DMA_MEMCPY is
> > not always available?
Right now pci-epf-test driver has three ways:
-DMA API + dmaengine dmaengine_prep_slave_single()
-DMA API + dmaengine dmaengine_prep_dma_memcpy()
-memcpy_toio()/memcpy_fromio().
pci-epf-mhi.c driver has two ways:
-DMA API + dmaengine dmaengine_prep_slave_single()
-memcpy_toio()/memcpy_fromio().
pci-epf-test.c:
-Always calls pci_epc_map_addr() when using DMA API.
pci-epf-mhi.c:
-Never calls pci_epc_map_addr() when using DMA API.
I honestly don't see any point of having three paths
for pci-epf-test. Ideally I would want one, max two.
If you think that:
-DMA API + dmaengine dmaengine_prep_slave_single()
+
-memcpy_toio()/memcpy_fromio().
is more logical than:
-DMA API + dmaengine dmaengine_prep_slave_single()
+
-DMA API + dmaengine dmaengine_prep_dma_memcpy()
Then I think we should rip out the:
-DMA API + dmaengine dmaengine_prep_dma_memcpy()
it serves no purpose... if you don't have a "real" DMA channel,
just run without the -d flag.
Or, if you argue that the dmaengine_prep_dma_memcpy() is there
to test the DMA API code (which I can't say that it does, since
it doesn't use the exact same code path as a "real" DMA channel, see:
https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155
so this argument is questionable).
Put it under a --use_dummy_dma, and return failure by default
if no "real" DMA channel is found.
But even so, that would not address the pci-epf-test and
pci-mhi-test inconsistency WRT pci_epc_map_addr().
I think if we rip out:
-DMA API + dmaengine dmaengine_prep_dma_memcpy()
we could also move the pci_epc_map_addr() so that it is
only used for the memcpy_toio()/memcpy_fromio() path.
(Or if we add a --use_dummy_dma, we can move the pci_epc_map_addr() to
that path, and remove it from the dmaengine_prep_slave_single() path.)
Kind regards,
Niklas