Re: [PATCH v2 5/6] powerpc/pseries/iommu: Make use of DDW even if it does not map the partition

From: Leonardo Bras
Date: Fri Jun 26 2020 - 13:55:57 EST


On Fri, 2020-06-26 at 12:23 -0300, Leonardo Bras wrote:
> On Wed, 2020-06-24 at 03:24 -0300, Leonardo Bras wrote:
> > As of today, if a DDW is created and can't map the whole partition, it's
> > removed and the default DMA window "ibm,dma-window" is used instead.
> >
> > Usually this DDW is bigger than the default DMA window, so it would be
> > better to make use of it instead.
> >
> > Signed-off-by: Leonardo Bras <leobras.c@xxxxxxxxx>
> > ---
>
> I tested this change with a 256GB DDW which did not map the whole
> partition, with a MT27700 Family [ConnectX-4 Virtual Function].
>
> I noticed the performance improvement is about the same as using DDW
> with IOMMU bypass.
>
> 64 thread write throughput: +203.0%
> 64 thread read throughput: +17.5%
> 1 thread write throughput: +20.5%
> 1 thread read throughput: +3.43%
> Average write latency: -23.0%
> Average read latency: -2.26%

The above improvements are relative to the default DMA window, which is
what is currently used when the DDW can't map the whole partition.

Each value is the average of 20 test runs per environment, 30 seconds
per run.
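
For anyone who wants to reproduce the arithmetic behind percentages like
the ones above, here is a minimal sketch. It assumes each figure is the
relative change of the DDW mean versus the default-window mean; the
function name and the sample numbers are placeholders for illustration,
not the script or the measurements used in these tests.

```python
from statistics import mean


def percent_change(baseline_runs, candidate_runs):
    """Relative change of the candidate mean versus the baseline mean, in percent."""
    base = mean(baseline_runs)
    if base == 0:
        raise ValueError("baseline mean is zero, relative change is undefined")
    return (mean(candidate_runs) - base) / base * 100.0


if __name__ == "__main__":
    # Placeholder per-run throughput values (hypothetical, not measured data).
    default_window_runs = [10.0, 12.0, 14.0]
    ddw_runs = [15.0, 15.0, 15.0]
    print(f"{percent_change(default_window_runs, ddw_runs):+.1f}%")
```

With this convention a positive value is an increase (good for
throughput) and a negative value is a decrease (good for latency).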

I also ran some longer stress tests, 5 hours each:
64 thread write throughput
64 thread read throughput

The throughput values were stable throughout both tests, and I saw no
errors in dmesg / journalctl.