RE: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
From: Stefano Stabellini
Date: Thu Jan 24 2019 - 14:14:59 EST
On Thu, 24 Jan 2019, Peng Fan wrote:
> Hi Stefano,
>
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:sstabellini@xxxxxxxxxx]
> > Sent: January 24, 2019 7:44
> > To: hch@xxxxxxxxxxxxx
> > Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>; Peng Fan
> > <peng.fan@xxxxxxx>; mst@xxxxxxxxxx; jasowang@xxxxxxxxxx;
> > xen-devel@xxxxxxxxxxxxxxxxxxxx; linux-remoteproc@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx;
> > luto@xxxxxxxxxx; jgross@xxxxxxxx; boris.ostrovsky@xxxxxxxxxx;
> > bjorn.andersson@xxxxxxxxxx; jliang@xxxxxxxxxx
> > Subject: Re: [Xen-devel] [RFC] virtio_ring: check dma_mem for xen_domain
> >
> > On Wed, 23 Jan 2019, hch@xxxxxxxxxxxxx wrote:
> > > On Wed, Jan 23, 2019 at 01:04:33PM -0800, Stefano Stabellini wrote:
> > > > If vring_use_dma_api is actually supposed to return true when
> > > > dma_dev->dma_mem is set, then both Peng's patch and the patch I
> > > > wrote are not fixing the real issue here.
> > > >
> > > > I don't know enough about remoteproc to know where the problem
> > > > actually lies though.
> > >
> > > The problem is the following:
> > >
> > > Devices can declare a specific memory region that they want to use
> > > when the driver calls dma_alloc_coherent for the device, this is done
> > > using the shared-dma-pool DT attribute, which comes in two variants
> > > that would be a little too much to explain here.
> > >
> > > remoteproc makes use of that because apparently the device can only
> > > communicate using that region. But it then feeds back memory obtained
> > > with dma_alloc_coherent into the virtio code. For that it calls
> > > vmalloc_to_page on the dma_alloc_coherent memory, which is a huge no-go for
> > > the DMA API and only worked accidentally on a few platforms, and
> > > apparently arm64 just changed a few internals that made it stop
> > > working for remoteproc.
> > >
> > > The right answer is to not use the DMA API to allocate memory from a
> > > device-specific region, but to tie the driver directly into the DT
> > > reserved memory API in a way that allows it to easily obtain a struct
> > > device for it.
> >
> > If I understand correctly, Peng should be able to reproduce the problem on
> > native Linux without any Xen involvement simply by forcing
> > vring_use_dma_api to return true. Peng, can you confirm?
>
> That is a separate issue that does not involve Xen.
> There is a thread discussing it: https://patchwork.kernel.org/patch/10742923/
>
> Without Xen, vring_use_dma_api will return false.
> With Xen, if vring_use_dma_api returns true, it will call dma_map_xx and trigger a dump.
It is true that for Xen on ARM DomUs it is not necessary today to return
true from vring_use_dma_api. However, returning true from
vring_use_dma_api should not break Linux. When the rpmsg issue is
fixed, this problem should also go away without any need for additional
changes on the Xen side, I think.
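
As a rough illustration of the behavior described in this thread, the
decision can be modeled in user space as below. This is only a sketch, not
the kernel source: the struct, the flag, and every model_* name are
hypothetical stand-ins, and the has_iommu_quirk field is an assumed
simplification of the check that virtio_ring.c performed at the time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a virtio device; not a kernel type. */
struct model_device {
	/* Assumed to model the legacy "bypass the IOMMU" quirk. */
	bool has_iommu_quirk;
};

/* Hypothetical flag standing in for the kernel's xen_domain(). */
static bool model_xen_domain_flag;

static bool model_xen_domain(void)
{
	return model_xen_domain_flag;
}

/*
 * Sketch of the decision discussed above: for a quirky device the DMA API
 * is used only when running as a Xen guest, which matches "without Xen it
 * returns false, with Xen it returns true".
 */
static bool model_vring_use_dma_api(const struct model_device *dev)
{
	if (!dev->has_iommu_quirk)
		return true;

	if (model_xen_domain())
		return true;

	return false;
}
```

With has_iommu_quirk set, the model returns false on native Linux and true
under Xen, which is the case where the dma_map path is taken and the dump
Peng reported is triggered.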