Re: [PATCH v5 11/13] xen: introduce xen_alloc/free_coherent_pages

From: Stefano Stabellini
Date: Wed Sep 11 2013 - 13:35:00 EST


On Wed, 11 Sep 2013, Catalin Marinas wrote:
> On Mon, Sep 09, 2013 at 05:46:59PM +0100, Stefano Stabellini wrote:
> > On Mon, 9 Sep 2013, Catalin Marinas wrote:
> > > >>> They could also happen in a DomU if we assign a physical device to it
> > > >>> (and an SMMU is not available).
> > > >>
> > > >> The problem is that you don't necessarily know what kind of coherency you
> > > >> need for a physical device. As I said, we plan to do this DT-driven.
> > > >
> > > > OK, but if I call arm64_swiotlb_dma_ops.alloc passing the right
> > > > arguments to it, I should be able to get the right coherency for the
> > > > right device, correct?
> > >
> > > I think it needs a bit more work on the Xen part. Basically
> > > dma_alloc_attrs() calls get_dma_ops() to obtain the best DMA operations
> > > for a device. arm64_swiotlb_dma_ops is just the default implementation
> > > and I'll add a _noncoherent variant as well. Default dma_ops will be
> > > set to one of these during boot. But a device is also allowed to have
> > > its own dev->archdata.dma_ops, set via set_dma_ops().
> > >
> > > So even if you set the default dma_ops to Xen ops, you may not get them
> > > via dma_alloc_coherent(). I don't see any easier solution other than
> > > patching the dma_alloc_attrs() function to issue a Hyp call after the
> > > memory has been allocated with the get_dma_ops()->alloc(). But I don't
> > > like this either.
> >
> > I see. This problem affects arch/arm as well.
> > Either we add an if (!xen_domain()) in get_dma_ops, or we could make
> > get_dma_ops a function pointer and let people overwrite it.
> > See below the first option implemented for arch/arm on top of the
> > swiotlb series:
> >
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> > index 7d6e4f9..0b8b5e4 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -12,6 +12,8 @@
> > #include <asm/memory.h>
> > #include <asm/cacheflush.h>
> >
> > +#include <xen/xen.h>
> > +
> > #define DMA_ERROR_CODE (~0)
> > extern struct dma_map_ops *dma_ops;
> > extern struct dma_map_ops arm_dma_ops;
> > @@ -19,7 +21,7 @@ extern struct dma_map_ops arm_coherent_dma_ops;
> >
> > static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> > {
> > -	if (dev && dev->archdata.dma_ops)
> > +	if (!xen_domain() && dev && dev->archdata.dma_ops)
> > 		return dev->archdata.dma_ops;
> > 	return dma_ops;
> > }
> > diff --git a/arch/arm/include/asm/xen/page-coherent.h b/arch/arm/include/asm/xen/page-coherent.h
> > index af2cf8d..c2232fe 100644
> > --- a/arch/arm/include/asm/xen/page-coherent.h
> > +++ b/arch/arm/include/asm/xen/page-coherent.h
> > @@ -9,6 +9,8 @@ static inline void *xen_alloc_coherent_pages(struct device *hwdev, size_t size,
> > dma_addr_t *dma_handle, gfp_t flags,
> > struct dma_attrs *attrs)
> > {
> > +	if (hwdev && hwdev->archdata.dma_ops)
> > +		return hwdev->archdata.dma_ops->alloc(hwdev, size, dma_handle, flags, attrs);
> > 	return arm_dma_ops.alloc(hwdev, size, dma_handle, flags, attrs);
>
> I still don't like xen digging into the archdata internals.

Me neither.


> What about:
>
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
> index 5b579b9..5fa472c 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -15,13 +15,20 @@
> extern struct dma_map_ops arm_dma_ops;
> extern struct dma_map_ops arm_coherent_dma_ops;
>
> -static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +static inline struct dma_map_ops *__get_dma_ops(struct device *dev)
> {
> 	if (dev && dev->archdata.dma_ops)
> 		return dev->archdata.dma_ops;
> 	return &arm_dma_ops;
> }
>
> +static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +{
> +	if (xen_domain())
> +		return xen_dma_ops;
> +	return __get_dma_ops(dev);
> +}

I agree that this is better.


> static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
> {
> 	BUG_ON(!dev);
> @@ -32,7 +39,7 @@ static inline void set_dma_ops(struct device *dev, struct dma_map_ops *ops)
>
> static inline int dma_set_mask(struct device *dev, u64 mask)
> {
> -	return get_dma_ops(dev)->set_dma_mask(dev, mask);
> +	return __get_dma_ops(dev)->set_dma_mask(dev, mask);
> }
>
> #ifdef __arch_page_to_dma

I don't understand the reason for this change though: shouldn't
set_dma_mask go via the "default" (whatever that is), like the others?

On native it won't make a difference: either way it will end up calling
arm_set_dma_mask.

On Xen it would make a difference, because
get_dma_ops(dev)->set_dma_mask would end up calling
xen_swiotlb_set_dma_mask, which checks whether the mask is supported by
the swiotlb buffer before setting the mask for the device, while
arm_set_dma_mask doesn't do that.
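
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. Every model_* name is made up for illustration, the structs
are cut down to the fields needed here, and the swiotlb check is reduced
to a single comparison against an assumed end-of-buffer bus address. The
two lookup helpers mirror __get_dma_ops() and get_dma_ops() from the
hunk quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; none of this is the kernel's actual definition. */
struct device;

struct dma_map_ops {
	int (*set_dma_mask)(struct device *dev, uint64_t mask);
};

struct device {
	uint64_t dma_mask;
	struct dma_map_ops *arch_dma_ops;	/* stands in for dev->archdata.dma_ops */
};

static bool model_xen_domain;			/* stands in for xen_domain() */

/* Assumed last bus address of the bounce buffer (hypothetical value). */
static uint64_t model_swiotlb_end = 0xffffffffULL;

/* Models arm_set_dma_mask(): stores the mask without looking at the swiotlb. */
static int model_arm_set_dma_mask(struct device *dev, uint64_t mask)
{
	dev->dma_mask = mask;
	return 0;
}

/* Models xen_swiotlb_set_dma_mask(): refuses masks that cannot reach the buffer. */
static int model_xen_set_dma_mask(struct device *dev, uint64_t mask)
{
	if (model_swiotlb_end > mask)
		return -EIO;
	dev->dma_mask = mask;
	return 0;
}

static struct dma_map_ops model_arm_dma_ops = { .set_dma_mask = model_arm_set_dma_mask };
static struct dma_map_ops model_xen_dma_ops = { .set_dma_mask = model_xen_set_dma_mask };

/* Counterpart of __get_dma_ops(): per-device ops if set, else the ARM ones. */
static struct dma_map_ops *model_native_get_dma_ops(struct device *dev)
{
	if (dev && dev->arch_dma_ops)
		return dev->arch_dma_ops;
	return &model_arm_dma_ops;
}

/* Counterpart of get_dma_ops(): the Xen ops win when running on Xen. */
static struct dma_map_ops *model_get_dma_ops(struct device *dev)
{
	if (model_xen_domain)
		return &model_xen_dma_ops;
	return model_native_get_dma_ops(dev);
}

/* dma_set_mask() as in the quoted hunk: bypasses the Xen ops. */
static int model_dma_set_mask_native(struct device *dev, uint64_t mask)
{
	assert(dev != NULL);
	return model_native_get_dma_ops(dev)->set_dma_mask(dev, mask);
}

/* dma_set_mask() going through the default lookup, like the other wrappers. */
static int model_dma_set_mask_default(struct device *dev, uint64_t mask)
{
	assert(dev != NULL);
	return model_get_dma_ops(dev)->set_dma_mask(dev, mask);
}
```

In this model, with model_xen_domain set, a 24-bit mask is accepted by
model_dma_set_mask_native() but rejected with -EIO by
model_dma_set_mask_default(), because it does not cover the assumed
bounce buffer. With model_xen_domain clear the two behave the same.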


> And in xen_alloc_coherent_pages():
>
> return __get_dma_ops(hwdev)->alloc(...);

Right.


> Alternatively, add the xen_domain() check in dma_alloc_attrs() instead
> of get_dma_ops() (and other functions like map_sg etc.)

I prefer this approach because it is going to be more concise.
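
For reference, the alternative of putting the xen_domain() check in
dma_alloc_attrs() could look roughly like the userspace sketch below.
This is only a model under my own assumptions: the model_* names, the
cut-down structs and the counters are hypothetical, calloc() stands in
for the real allocator, and the Xen-specific step is a placeholder.
get_dma_ops() is left without any Xen check, and the Xen alloc obtains
its memory from whatever ops the device would have used natively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of this is the kernel's actual definition. */
struct device;

struct dma_map_ops {
	void *(*alloc)(struct device *dev, size_t size);
};

struct device {
	struct dma_map_ops *arch_dma_ops;	/* stands in for dev->archdata.dma_ops */
	int native_allocs;			/* how often the native alloc ran */
	int xen_allocs;				/* how often the Xen step ran */
};

static bool model_xen_domain;			/* stands in for xen_domain() */

static struct dma_map_ops *model_get_dma_ops(struct device *dev);

/* Models the native ARM alloc; calloc() replaces the real allocator. */
static void *model_arm_alloc(struct device *dev, size_t size)
{
	dev->native_allocs++;
	return calloc(1, size);
}

/*
 * Models the Xen alloc: take memory from the device's native ops first,
 * then do the Xen-specific work (only counted here).
 */
static void *model_xen_alloc(struct device *dev, size_t size)
{
	void *p = model_get_dma_ops(dev)->alloc(dev, size);

	if (p)
		dev->xen_allocs++;	/* placeholder for the Xen-specific step */
	return p;
}

static struct dma_map_ops model_arm_dma_ops = { .alloc = model_arm_alloc };
static struct dma_map_ops model_xen_dma_ops = { .alloc = model_xen_alloc };

/* get_dma_ops() left unchanged: no xen_domain() check in here. */
static struct dma_map_ops *model_get_dma_ops(struct device *dev)
{
	if (dev && dev->arch_dma_ops)
		return dev->arch_dma_ops;
	return &model_arm_dma_ops;
}

/* The xen_domain() check lives in the wrapper instead. */
static void *model_dma_alloc_attrs(struct device *dev, size_t size)
{
	struct dma_map_ops *ops;

	assert(dev != NULL);	/* the model does not handle a NULL device */
	ops = model_xen_domain ? &model_xen_dma_ops : model_get_dma_ops(dev);
	return ops->alloc(dev, size);
}
```

The same check would have to be repeated in each wrapper that needs it
(map_sg and so on), which is the cost of keeping get_dma_ops() untouched.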