Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain

From: Michael S. Tsirkin
Date: Thu Jan 24 2019 - 15:34:31 EST


On Thu, Jan 24, 2019 at 11:14:53AM -0800, Stefano Stabellini wrote:
> On Thu, 24 Jan 2019, Peng Fan wrote:
> > Hi Stefano,
> >
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:sstabellini@xxxxxxxxxx]
> > > Sent: January 24, 2019 7:44
> > > To: hch@xxxxxxxxxxxxx
> > > Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>; Peng Fan
> > > <peng.fan@xxxxxxx>; mst@xxxxxxxxxx; jasowang@xxxxxxxxxx;
> > > xen-devel@xxxxxxxxxxxxxxxxxxxx; linux-remoteproc@xxxxxxxxxxxxxxx;
> > > linux-kernel@xxxxxxxxxxxxxxx; virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx;
> > > luto@xxxxxxxxxx; jgross@xxxxxxxx; boris.ostrovsky@xxxxxxxxxx;
> > > bjorn.andersson@xxxxxxxxxx; jliang@xxxxxxxxxx
> > > Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
> > >
> > > On Wed, 23 Jan 2019, hch@xxxxxxxxxxxxx wrote:
> > > > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > > > If vring_use_dma_api is actually supposed to return true when
> > > > > dma_dev->dma_mem is set, then both Peng's patch and the patch I
> > > > > wrote are not fixing the real issue here.
> > > > >
> > > > > I don't know enough about remoteproc to know where the problem
> > > > > actually lies though.
> > > >
> > > > The problem is the following:
> > > >
> > > > Devices can declare a specific memory region that they want to use
> > > > when the driver calls dma_alloc_coherent for the device, this is done
> > > > using the shared-dma-pool DT attribute, which comes in two variants
> > > > > that would be a little too much to explain here.
> > > >
> > > > remoteproc makes use of that because apparently the device can only
> > > > communicate using that region. But it then feeds back memory obtained
> > > > with dma_alloc_coherent into the virtio code. For that it calls
> > > > > vmalloc_to_page on the dma_alloc_coherent, which is a huge no-go for
> > > > > the DMA API and only worked accidentally on a few platforms, and
> > > > apparently arm64 just changed a few internals that made it stop
> > > > working for remoteproc.
> > > >
> > > > The right answer is to not use the DMA API to allocate memory from a
> > > > > device-specific region, but to tie the driver directly into the DT
> > > > > reserved memory API in a way that allows it to easily obtain a struct
> > > > device for it.
> > >
> > > If I understand correctly, Peng should be able to reproduce the problem on
> > > native Linux without any Xen involvement simply by forcing
> > > vring_use_dma_api to return true. Peng, can you confirm?
> >
> > Without Xen involvement it is a different issue.
> > There is a thread discussing it: https://patchwork.kernel.org/patch/10742923/
> >
> > Without Xen, vring_use_dma_api will return false.
> > With Xen, if vring_use_dma_api returns true, it will call dma_map_xx and trigger a dump.
>
> It is true that for Xen on ARM DomUs it is not necessary today to return
> true from vring_use_dma_api. However, returning true from
> vring_use_dma_api should not break Linux. When the rpmsg issue is
> fixed, this problem should also go away without any need for additional
> changes on the Xen side, I think.

The fewer systems bypass the standard virtio logic (using the feature
bit to decide whether to bypass the DMA API), the better.
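
For reference, the decision discussed in this thread can be modelled in
a few lines of C. This is a simplified stand-alone sketch, not the
kernel source: struct vdev_model and model_use_dma_api are hypothetical
names, and the two booleans are assumed stand-ins for the negotiated
platform feature bit (VIRTIO_F_IOMMU_PLATFORM at the time of this
thread) and for xen_domain().

```c
#include <stdbool.h>

/* Hypothetical model of the inputs the decision depends on. */
struct vdev_model {
	/* Assumption: true if the device negotiated the platform feature bit. */
	bool has_platform_feature;
	/* Assumption: stands in for the kernel's xen_domain() check. */
	bool running_on_xen;
};

/*
 * Simplified model of the vring_use_dma_api decision:
 * the feature bit is the standard way to request the DMA API;
 * the Xen check is the extra special case debated in this thread.
 */
static bool model_use_dma_api(const struct vdev_model *v)
{
	/* Standard virtio logic: the feature bit decides. */
	if (v->has_platform_feature)
		return true;

	/* Special case: Xen guests use the DMA API regardless. */
	if (v->running_on_xen)
		return true;

	/* Otherwise the DMA API is bypassed. */
	return false;
}
```

In this model the remoteproc case on native Linux (no feature bit, no
Xen) bypasses the DMA API, while the same device in a Xen guest does
not, which matches the behaviour Peng describes above.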