RE: [PATCHv4] DMAEngine: Define interleaved transfer request api

From: Bounine, Alexandre
Date: Tue Oct 18 2011 - 13:58:07 EST


On Tue, Oct 18, 2011 at 7:51 AM, Jassi Brar <jaswinder.singh@xxxxxxxxxx> wrote:
>
> On 18 October 2011 15:19, Russell King <rmk@xxxxxxxxxxxxxxxx> wrote:
>
> >> >> > With item #1 above being a separate topic, I may have a problem
> >> >> > with #2 as well: dma_addr_t is sized for the local platform and
> >> >> > not guaranteed to be a 64-bit value (which may be required by a
> >> >> > target).
> >> >> > Agree with #3 (if #1 and #2 work).
> >> >> >
> >> >> Perhaps simply change dma_addr_t to u64 in dmaengine.h alone ?
> >> >
> >> > That's just an idiotic suggestion - there's no other way to put that.
> >> > Let's have some sanity here.
> >> >
> >> Yeah, I am not proud of the workaround, so I only probed the option.
> >> I think I need to explain myself.
> >>
> >> The case here is that even a 32-bit RapidIO host could ask for a transfer
> >> against a 64-bit address space on a remote device. And vice versa 64->32.
> >>
> >> > dma_addr_t is the size of a DMA address for the CPU architecture being
> >> > built.  This has no relationship to what any particular DMA engine uses.
> >> >
> >> Yes, so far the dmaengine has only ever needed to transfer within the
> >> platform's address-space. So the assumption that src and dst addresses
> >> could be contained within dma_addr_t worked.
> >> If the dmaengine is to get rid of that assumption/constraint, the memcpy,
> >> slave_sg etc. need to accept addresses specified in the bigger of the
> >> host and remote address spaces, and u64 is the safe option.
> >> Ultimately dma_addr_t is either u32 or u64.
> >
> > Let me spell it out:
> >
> > 1. Data structures read by the DMA engine hardware should not be defined
> >   using the 'dma_addr_t' type, but one of the [bl]e{8,16,32,64} types,
> >   or at a push the u{8,16,32,64} types if they're always host-endian.
> >
> >   This helps to ensure that the layout of the structures read by the
> >   hardware is less dependent on the host architecture and each element
> >   is appropriately sized (and, with sparse and the endian-sized types,
> >   can be endian-checked at compile time.)
> >
> > 2. dma_addr_t is the size of the DMA address for the host architecture.
> >   This may be 32-bit or 64-bit depending on the host architecture.
> >
> > The following points are my opinion:
> >
> > 3. For architectures where there are only 32-bit DMA addresses, dma_addr_t
> >   will be a 32-bit type.  For architectures where there are 64-bit DMA
> >   addresses, it will be a 64-bit type.
> >
> > 4. If RIO can accept 64-bit DMA addresses but is only connected to 32-bit
> >   busses, then the top 32 address bits are not usable (it's truncated in
> >   hardware.)  So there's no point passing around a 64-bit DMA address.
> >
> > 5. In the case of a 64-bit dma_addr_t and a 32-bit DMA engine host being
> >   asked to transfer >= 4GB, this needs error handling in the DMA engine
> >   driver (I don't think it's checked for - I know amba-pl08x doesn't.)
> >
> > 6. 32-bit dma_addr_t with 64-bit DMA address space is a problem and is
> >   probably a bug in itself - the platform should be using a 64-bit
> >   dma_addr_t in this case.  (see 3.)
> >
> Thanks for the detailed explanation.
>
> RapidIO is a packet-switched interconnect with a parallel or serial
> interface. Among other things, a packet contains a 32-, 48- or 64-bit
> offset into the remote-endpoint's address space. So I don't get how any
> of the above 6 points apply here.
>
> Though I agree it is peculiar for a networking technology to expose a
> DMAEngine interface, I assume Alex, who knows RIO better than us, has
> good reasons for it.

To keep it simple, look at this as peer-to-peer networking with the HW
ability to directly address the memory of the link partner.

RapidIO supports messaging, which is closer to traditional networking and is
handled by the RIONET driver (pretending to be Ethernet). But in some
situations messaging cannot be used; in those cases directly addressed
memory read/write operations take place.
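
To make the addressing point concrete: whatever final shape the API takes, a
RIO transfer has to carry the offset into the link partner's address space as
a full 64-bit value, independent of the host's dma_addr_t. A minimal sketch of
such a per-transfer context follows; the structure and field names
(rio_dma_ext, destid, rio_addr) are illustrative only, not an existing
interface.

#include <linux/types.h>

/*
 * Illustrative only: the extra, target-side information a RIO-aware DMA
 * transfer needs on top of the usual host-side scatterlist.  The remote
 * address is an offset into the link partner's address space (32, 48 or
 * 64 bits in the RIO packet), so it is kept as u64 even on hosts where
 * dma_addr_t is only 32 bits wide.
 */
struct rio_dma_ext {
	u16 destid;	/* destination ID of the link partner */
	u64 rio_addr;	/* offset in the remote endpoint's address space */
};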

I would like to give a simple example of a RIO-based system that may help in
understanding our DMA requirements.

Consider a platform with one host CPU and several DSP cards connected
to it through a switched backplane (transparent for the purposes of this
example).

The host CPU has one or more RIO-capable DMA channels and runs device
drivers for the connected DSP cards. Each device driver is required to load
individual program code into the corresponding DSP(s). Directly addressed
writes make a lot of sense here.
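
To sketch what that code-loading step would look like from a DSP device
driver, assuming some way of handing the remote RIO address to the DMA
channel existed: rio_dma_prep_write() below is a hypothetical helper (only
dmaengine_submit(), dma_submit_error() and dma_async_issue_pending() are
existing dmaengine calls), and dsp_imem stands for the DSP's instruction
memory offset in its own address space.

#include <linux/dmaengine.h>
#include <linux/scatterlist.h>
#include <linux/errno.h>

/*
 * Hypothetical helper: prepare a directly addressed RIO write of the
 * host-side scatterlist to offset 'rio_addr' on endpoint 'destid'.
 * Nothing like this exists in dmaengine today; it only marks the spot
 * where the 64-bit remote address would have to be passed in.
 */
struct dma_async_tx_descriptor *
rio_dma_prep_write(struct dma_chan *chan, u16 destid, u64 rio_addr,
		   struct scatterlist *sgl, unsigned int sg_len);

/* Sketch: load a DSP program image that is already DMA-mapped in 'sg'. */
static int dsp_load_code(struct dma_chan *chan, u16 destid, u64 dsp_imem,
			 struct scatterlist *sg, unsigned int sg_len)
{
	struct dma_async_tx_descriptor *tx;
	dma_cookie_t cookie;

	tx = rio_dma_prep_write(chan, destid, dsp_imem, sg, sg_len);
	if (!tx)
		return -EIO;

	cookie = dmaengine_submit(tx);		/* standard submit path */
	if (dma_submit_error(cookie))
		return -EIO;

	dma_async_issue_pending(chan);		/* kick the channel */
	return 0;
}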

After the DSP code is loaded, the device drivers start the DSP program and
may participate in data transfers between the DSP cards and the host CPU.
Again, messaging-type transfers may add unnecessary overhead here compared
to direct data reads/writes.
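
Purely as a sketch of how such a host-to-DSP data move could be described
with the interleaved template this patch introduces: the helper and field
names below (dmaengine_prep_interleaved_dma(), struct
dma_interleaved_template) follow the form the API later took in mainline and
may differ in detail from this v4 posting, and dst_start being a dma_addr_t
is exactly where the 64-bit remote RIO offset problem discussed above shows
up.

#include <linux/dmaengine.h>
#include <linux/slab.h>
#include <linux/errno.h>

/*
 * Sketch: push 'numf' frames of 'frame_len' bytes from host memory to a
 * DSP card as one interleaved transfer request.  Field names follow the
 * mainline form of the API and may not match this v4 patch exactly.
 */
static int dsp_push_frames(struct dma_chan *chan, dma_addr_t host_buf,
			   dma_addr_t dsp_buf, size_t numf, size_t frame_len)
{
	struct dma_interleaved_template *xt;
	struct dma_async_tx_descriptor *tx;
	dma_cookie_t cookie;
	int ret = 0;

	/* template plus one data_chunk describing each frame */
	xt = kzalloc(sizeof(*xt) + sizeof(struct data_chunk), GFP_KERNEL);
	if (!xt)
		return -ENOMEM;

	xt->src_start = host_buf;
	xt->dst_start = dsp_buf;	/* dma_addr_t: a 64-bit remote RIO
					 * offset does not fit here on a
					 * 32-bit host - the open issue in
					 * this thread */
	xt->dir = DMA_MEM_TO_DEV;
	xt->numf = numf;
	xt->frame_size = 1;		/* one contiguous chunk per frame */
	xt->sgl[0].size = frame_len;
	xt->sgl[0].icg = 0;		/* frames laid out back to back */
	xt->src_inc = true;
	xt->dst_inc = true;

	tx = dmaengine_prep_interleaved_dma(chan, xt, DMA_PREP_INTERRUPT);
	if (!tx) {
		ret = -EIO;
		goto out;
	}

	cookie = dmaengine_submit(tx);
	if (dma_submit_error(cookie)) {
		ret = -EIO;
		goto out;
	}

	dma_async_issue_pending(chan);
out:
	kfree(xt);
	return ret;
}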

The configuration of each DSP card may be different, but from the host's
POV each is RIO spec compliant.

