Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
From: Jerome Glisse
Date: Thu Sep 17 2015 - 15:40:49 EST
On Thu, Sep 17, 2015 at 03:31:58PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 03:07:47PM -0400, Jerome Glisse wrote:
> > On Thu, Sep 17, 2015 at 03:02:51PM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Sep 17, 2015 at 02:22:38PM -0400, jglisse@xxxxxxxxxx wrote:
> > > > From: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > > >
> > > > The swiotlb dma backend is not appropriate for some devices like
> > > > GPUs, where bounce buffers or slow dma page allocations are just not
> > > > acceptable. With that helper, device drivers can opt out of the
> > > > swiotlb and just do sane things without wasting CPU cycles inside
> > > > the swiotlb code.
> > >
> > > What if SWIOTLB is the only one available?
> >
> > On x86 no_mmu is always available, and we assume that device drivers
> > that would use this know that their device can access all memory
> > with no restriction, or at the very least use the DMA32 gfp flag.
>
> That runs afoul of the purpose of the DMA API. On x86 you may have
> an IOMMU - GART, AMD Vi, Intel VT-d, Calgary, etc which will provide
> you with the proper dma address. As the physical to bus address
> topology does not have to be 1:1.
I am well aware of that, but sadly the IOMMU is not as widespread on x86
as you would think. On many platforms it is still disabled by default by
the BIOS, so the Linux kernel ends up binding the swiotlb as the default
dma ops and you have a 1:1 mapping between bus and physical addresses.
My patch does not impact the case where you have an IOMMU; it only caters
to the case where the swiotlb is the DMA API backend.
> >
> >
> > > And why can't the devices use the TTM DMA backend which sets up
> > > buffers which don't need bounce buffer or slow dma page allocations?
> >
> > We want to get rid of this TTM code path for radeon and likely
> > nouveau. This is the motivation for that patch. Benchmarks show
> > that the TTM DMA backend is much slower (20% on some
> > benchmarks) than the regular page allocation and going through
> > no_mmu.
>
> You end up using the DMA API scatter gather API later on though.
The DMA API scatter gather is only used for DMA buffer objects, and
these are a minority of the buffer objects you have on today's graphics
stacks. It is also not used to present contiguous addresses to the GPU
(at least on the GPUs I care about). So most GPU objects do not
use the DMA API scatter gather but the GPU hardware MMU, which does
the scatter gather itself.
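
To illustrate what "the GPU hardware MMU does the scatter gather" means,
here is a minimal user-space sketch. It assumes a single-level page table
with 4 KiB pages; every name in it (model_gpu_vm, model_gpu_bind,
model_gpu_translate) is hypothetical and does not correspond to radeon,
nouveau, or any real hardware layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1ULL << MODEL_PAGE_SHIFT)
#define MODEL_PT_ENTRIES 8

/* Hypothetical flat GPU page table: entry i backs GPU virtual page i. */
struct model_gpu_vm {
    uint64_t pt[MODEL_PT_ENTRIES];
};

/*
 * Bind n scattered, page-aligned bus addresses to GPU virtual pages
 * 0..n-1. No contiguous system memory is required.
 */
static void model_gpu_bind(struct model_gpu_vm *vm,
                           const uint64_t *pages, size_t n)
{
    assert(n <= MODEL_PT_ENTRIES);
    for (size_t i = 0; i < n; i++) {
        assert((pages[i] & (MODEL_PAGE_SIZE - 1)) == 0);
        vm->pt[i] = pages[i];
    }
}

/* Translate a GPU virtual address the way the modelled MMU would. */
static uint64_t model_gpu_translate(const struct model_gpu_vm *vm,
                                    uint64_t gpu_va)
{
    uint64_t idx = gpu_va >> MODEL_PAGE_SHIFT;

    assert(idx < MODEL_PT_ENTRIES);
    return vm->pt[idx] | (gpu_va & (MODEL_PAGE_SIZE - 1));
}
```

In this model the GPU sees one linear range even though the backing
pages are scattered, so nothing has to be made contiguous on the CPU side.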
>
> I am also a bit confused on your use-case - when do you see this?
> On regular desktop machines you will use the IOMMU API most of
> the time because that hardware exists. The SWIOTLB should only
> be used on hardware that is old, odd, or perhaps virtualized.
Sadly that is not the case. Even recent x86 computers have the IOMMU
disabled by default by the BIOS. Users need to go into the BIOS and
enable the virtualization option for the IOMMU to be enabled. I wish
the IOMMU was the default on all recent computers, but it is just
not the case.
> >
> > So this is all about allowing pages to be allocated directly through
> > the regular kernel page alloc code and not through a specialized dma
> > allocator.
>
> .. What you are saying is that the intent of this patch is
> to not use TTM DMA.
>
> Are you using the SWIOTLB 99% of the time? 1%? Or is this
> related to the unfortunate patch that enabled SWIOTLB all the time?
> (If so, please please mention that in the commit, it didn't
> occur to me until just now).
Yes, the patch that always enables the SWIOTLB is a pain point, but
this patch also had other purposes that are escaping my mind right now.
After discussion with other folks, it seemed like the easiest
solution would be to opt out of the swiotlb if it is in use.
>
> If that is the case we should attack the problem in a different
> way - see if the IOMMU API is setup? Or is that set already
> to some no_iommu option?
>
> I think what you are looking for is a simple flag telling you
> whether the IOMMU is there - in which case use the streaming
> DMA API calls (dma_map_page, etc)?
Device drivers would still use dma_map_page, but it would not be
the swiotlb one but the no_mmu one, which is pretty much a no-op
and thus fast.
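
As a rough illustration of the idea, here is a user-space model of
swapping the map_page backend. It is a sketch under assumptions, not the
code from the patch: struct model_dma_ops is a simplified stand-in for
the kernel's dma ops table, and model_swiotlb_opt_out, MODEL_BOUNCE_BASE
and MODEL_DMA_ERROR are made-up names for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BOUNCE_BASE 0x01000000ULL /* made-up bounce pool address */
#define MODEL_DMA_ERROR   (~0ULL)       /* made-up error sentinel */

/* Hypothetical, simplified stand-in for a dma ops table. */
struct model_dma_ops {
    uint64_t (*map_page)(uint64_t phys, uint64_t dma_mask, int *bounced);
};

/* swiotlb-like: bounce when the device cannot reach the page. */
static uint64_t model_swiotlb_map(uint64_t phys, uint64_t mask, int *bounced)
{
    *bounced = phys > mask;
    return *bounced ? MODEL_BOUNCE_BASE : phys;
}

/* no_mmu-like: 1:1 mapping, never bounces, fails if unreachable. */
static uint64_t model_nommu_map(uint64_t phys, uint64_t mask, int *bounced)
{
    *bounced = 0;
    return phys > mask ? MODEL_DMA_ERROR : phys;
}

static const struct model_dma_ops model_swiotlb_ops = { model_swiotlb_map };
static const struct model_dma_ops model_nommu_ops   = { model_nommu_map };

struct model_device {
    uint64_t dma_mask;
    const struct model_dma_ops *ops;
};

/*
 * Hypothetical opt-out helper: switch to the no_mmu-like ops only when
 * the swiotlb-like ops are in use. Any other backend (for example an
 * IOMMU) is left untouched. Returns 1 if the ops were switched.
 */
static int model_swiotlb_opt_out(struct model_device *dev)
{
    assert(dev != NULL);
    if (dev->ops != &model_swiotlb_ops)
        return 0;
    dev->ops = &model_nommu_ops;
    return 1;
}
```

The driver-side call stays the same (dev->ops->map_page here, standing
in for dma_map_page); only the backend behind it changes, and the driver
takes on the responsibility of allocating pages its device can reach.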
Cheers,
Jérôme