Re: [PATCH 00/17] dmaengine: dw-edma: Support dynamic LL appends
From: Koichiro Den
Date: Mon Jun 22 2026 - 03:39:08 EST
On Tue, Jun 16, 2026 at 12:40:54AM +0900, Koichiro Den wrote:
> Hi,
>
> This series is a reworked version of Frank's earlier RFT series:
>
> https://lore.kernel.org/dmaengine/20260109-edma_dymatic-v1-0-9a98c9c98536@xxxxxxx/
>
> After discussing the HDMA test results with Frank, I am sending this as a
> standalone series that keeps the main dynamic-append direction, while adding the
> fixes and HDMA handling needed to make it work reliably on both eDMA and HDMA.
>
> Several patches are kept from, or based on, Frank's RFT series; the individual
> patches carry the corresponding attribution.
>
> The series has been tested on both eDMA and HDMA systems. Both completed the fio
> test set reliably; performance results are shown below.
>
>
> Dependencies
> ============
>
> 1). [PATCH v7 0/9] dmaengine: Add new API to combine configuration and descriptor preparation
> https://lore.kernel.org/dmaengine/20260521-dma_prep_config-v7-0-1f73f4899883@xxxxxxx/
>
> 2). [PATCH v2 00/11] dmaengine: dw-edma: flatten desc structions and simple code
> https://lore.kernel.org/dmaengine/20260109-edma_ll-v2-0-5c0b27b2c664@xxxxxxx/
>
>
> Performance measurements
> ========================

Hi Frank, Niklas, all,

I am looking for a good way to stress PCIe controller DMA engines, such as
eDMA/HDMA, and measure their upper-bound throughput.

nvmet_pci_epf is useful since it is a real in-tree consumer, but it is not a
very direct benchmark for the DMA engine itself. So I wonder if
pci_endpoint_test would be a reasonable place to add an opt-in DMA performance
mode.

One possible option I have in mind is:

  - a new fixture, pci_ep_dma_perf
  - opt-in execution, for example via a PCITEST_PERF=1 environment variable
  - a few variants such as single and sg, possibly with a few knobs:
    - PCITEST_PERF_NUM_WORKERS, to use multiple EP-side workers
    - PCITEST_PERF_NUM_CHANS, to use multiple DMA channels
    - perhaps other knobs for SG entry size, number of entries, etc.
  - new tests: READ_PERF_TEST and WRITE_PERF_TEST
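
To make the knobs a bit more concrete, here is a minimal sketch of how the
selftest side might read them and report a throughput figure. Only the
environment variable names come from the list above. Everything else is a
hypothetical illustration and not existing pci_endpoint_test code: struct
perf_knobs, env_uint(), perf_knobs_from_env(), mib_per_sec(), the default of 1
worker and 1 channel, and the arbitrary upper bound of 1024.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical knob set; fields mirror the environment variables above. */
struct perf_knobs {
	bool enabled;		/* PCITEST_PERF=1 opts in to the perf tests */
	unsigned int workers;	/* PCITEST_PERF_NUM_WORKERS */
	unsigned int chans;	/* PCITEST_PERF_NUM_CHANS */
};

/*
 * Read a positive decimal integer from the environment.
 * Unset, empty, malformed, zero, or out-of-range values fall back to def.
 * The 1024 cap is an arbitrary assumption for this sketch.
 */
static unsigned int env_uint(const char *name, unsigned int def)
{
	const char *s = getenv(name);
	char *end;
	unsigned long v;

	if (!s || !*s)
		return def;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || *end != '\0' || v == 0 || v > 1024)
		return def;

	return (unsigned int)v;
}

/* Fill in the knobs, defaulting to disabled with one worker and one channel. */
static void perf_knobs_from_env(struct perf_knobs *k)
{
	assert(k);

	k->enabled = env_uint("PCITEST_PERF", 0) == 1;
	k->workers = env_uint("PCITEST_PERF_NUM_WORKERS", 1);
	k->chans = env_uint("PCITEST_PERF_NUM_CHANS", 1);
}

/* Convert a byte count and an elapsed time in nanoseconds to MiB/s. */
static double mib_per_sec(uint64_t bytes, uint64_t elapsed_ns)
{
	if (!elapsed_ns)
		return 0.0;

	return ((double)bytes / (1024.0 * 1024.0)) /
	       ((double)elapsed_ns / 1e9);
}
```

In a fixture, perf_knobs_from_env() would presumably be called during setup,
with the perf tests skipped when enabled is false, so that a default run of
the selftest is unaffected.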

Of the places I could think of, pci_endpoint_test still seems to be the best
fit. For example, extending dmatest does not seem to fit well because this
needs both EP-side and RC-side setup. A separate kselftest also feels like it
would duplicate a lot of pci_endpoint_test code. That said, I might be missing
something.

What do you think? Any thoughts or suggestions would be much appreciated.

Best regards,
Koichiro