Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config

From: Andy Shevchenko
Date: Fri May 15 2020 - 07:02:39 EST


On Tue, May 12, 2020 at 10:47:34PM +0300, Serge Semin wrote:
> On Tue, May 12, 2020 at 10:12:08PM +0300, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > > The IP core of the DW DMA controller may be synthesized with a different
> > > > > max burst length of the transfers per channel. According to Synopsys,
> > > > > having a fixed maximum burst transaction length may provide some
> > > > > performance gain. At the same time, setting up a source or destination
> > > > > multi-size exceeding the max burst length limitation may cause serious
> > > > > problems. In our case the system just hangs up. In order to fix this,
> > > > > let's introduce a max burst length platform config for the DW DMA
> > > > > controller device and don't let the DMA channel configuration code
> > > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > > configuration the maximum value can vary from channel to channel.
> > > > > It can be detected either at runtime from the DWC parameter registers
> > > > > or from a dedicated dts property.
> > > >
> > > > I'm wondering what can be the scenario when your peripheral will ask something
> > > > which is not supported by DMA controller?
> > >
> > > I may have misunderstood your statement, because given your activity around my
> > > patchsets, including the SPI patchset, and your sometimes very helpful comments,
> > > the answer to this question seems too obvious for you to be asking it.
> > >
> > > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > > specifies the burst length to be 16, while not all of our channels support it.
> > > Yes, originally it was developed for the Intel Medfield SPI, but since I
> > > converted the driver into generic code we can't use a fixed value. For instance,
> > > in our hardware only two DMA channels out of 16 in total are capable of bursting up
> > > to 16 bytes (data items) at a time; the rest of them are limited to a 4-byte
> > > burst length. Meanwhile there are two SPI interfaces, each of which needs two
> > > DMA channels for communications. So I need to allocate four channels in total to
> > > provide the DMA capability for all interfaces. In order to set the SPI controller
> > > up with valid, optimized parameters, the max burst length is required. Otherwise we
> > > can end up with buffer overruns/underruns.
> >
> > Right, and we come to the question of which channels are better used by the SPI
> > and which by the rest of the devices. Without a specific filter function you can
> > easily get into a case of inverted optimization, where SPI gets channels with
> > burst = 4 while it needs 16, and the other hardware the opposite. Performance-wise
> > that's the worst scenario, which we may want to avoid in the first place, right?
>
> If we start thinking like that, we'll get stuck at the problem of which interfaces
> should get the faster DMA channels and which should be left with the slowest. In
> general this task can't be solved, because without any application-specific
> requirement they are all equally valuable and deserve to have the best resources
> allocated. So we shouldn't assume that some interface is better or more valuable
> than another; therefore in generic DMA client code any filtering is redundant.

True, that's why I called them platform-dependent quirks. You may do whatever you
want or need to make your hardware perform as well as it can. If this inverse
optimization is okay for your hardware, then fine, the generic DMA client
really shouldn't care about it.

> > > > Peripherals need to supply a lot of configuration parameters specific to the
> > > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > > So, it seems to me the feasible approach is to supply correct data in the first place.
> > >
> > > How can clients supply valid data if they don't know the DMA controller
> > > limitations in general?
> >
> > This is a good question. DMA controllers are quite different, and having a
> > unified capabilities structure for all of them is an almost impossible task to
> > fulfil. That's why custom filter function(s) can help here. Based on the
> > compatible string you can implement whatever customized quirks you need, for
> > example two functions: try a burst size of 16 first and fall back to 4 if no
> > channel was found.
>
> Right. As I said in the previous email, it's up to the corresponding platforms to
> decide the filtering criteria, including the max burst length value.

Correct!

> Even though the DW DMA channel resources aren't uniform on the Baikal-T1 SoC, I
> also won't do filter-based channel allocation, because I can't predict the SoC's
> application. Some may be used on a platform with heavy SPI interface
> utilization, some with specific requirements for the UARTs, and so on.

It's your choice as platform maintainer.

> > > > If you have specific channels to acquire, then you probably need to provide
> > > > custom xlate / filter functions, because the above seems a bit of a hackish
> > > > workaround of the dynamic channel allocation mechanism.
> > >
> > > No, I don't have a specific channel to acquire, and in general you may use any
> > > channel returned by the DMA subsystem (though some platforms may need dedicated
> > > channels, in which case xlate / filter is required). In our SoC any DW DMAC
> > > channel can be used for any DMA-capable peripheral like SPI, I2C, UART. But
> > > their DMA settings must be properly and optimally configured. That can only be
> > > done if you know the DMA controller parameters like max burst length, max
> > > block size, etc.
> > >
> > > So no. The change proposed by this patch isn't a workaround but a useful
> > > feature, which moreover is expected to be supported by the generic DMA
> > > subsystem.
> >
> > See above.
> >
> > > > But let's see what we can do better. Since the maximum is defined on the
> > > > slave device side, it probably needs to define a minimum as well, otherwise
> > > > it's possible that some hardware can't cope with shorter bursts.
> > >
> > > There is no need to define a minimum if such a limit doesn't exist, except
> > > the natural 1. Moreover it doesn't seem to exist for any DMA controller,
> > > seeing that no one has added such a capability to the generic DMA subsystem
> > > so far.
> >
> > There is a contract between the provider and the consumer about the DMA
> > resource. That's why both sides should participate in fulfilling it.
> > Theoretically there may be hardware that for some reason doesn't support the
> > minimum burst available in the DMA controller. For such hardware we would need
> > the minimum to be provided as well.
>
> I don't think 'theoretical' considerations count when implementing something in
> the kernel. That 'theoretical' case may never happen, but you'll end up
> supporting dummy functionality. Practicality is what kernel developers normally
> place before anything else.

The point here is to avoid half-baked solutions.

I'm not against max-burst logic on top of the existing interface, but it would be
better if we allowed a range; in that case it will work for any DMA controller
(as part of the DMA engine family).

I guess we need to summarize this very long discussion and settle on the next steps.

(If you can provide a summary short enough that anybody can read it in a minute,
that would be nice; I've already forgotten tons of paragraphs you sent here,
especially taking into account the tons of paragraphs in the other Baikal
related series.)

--
With Best Regards,
Andy Shevchenko