Re: [PATCH] swiotlb: set IO TLB segment size via cmdline

From: Roman Skakun
Date: Tue Sep 21 2021 - 11:21:40 EST


Hi Robin,

>> I use Xen PV display. In my case, the PV display backend (Dom0) allocates
>> a contiguous buffer via the DMA API
>> to implement zero-copy between Dom0 and DomU.
>>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly. That might be something which needs fixing in Xen.
>

Right, if we want a zero-copy approach, we need to avoid using the
swiotlb bounce buffer for every device that may use this buffer.
The root of the problem is that this buffer is mapped to foreign pages,
and when I tried to retrieve the dma_addr for this buffer I got a
foreign MFN that does not fit in 32 bits, so swiotlb tries to use a
bounce buffer.
I understand that I need to find a way to avoid using swiotlb in this case.
At the moment, it's unclear how to do this properly.
But that is another story...

I guess we can have a situation where some device, like rcar-du, needs
to use a large buffer, greater than 256 KB (128 slots, the current
IO TLB segment size, * 2048 bytes per slot), and we need to adjust this
parameter at boot time rather than at compile time.
This patch was created to address that.
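
To make the arithmetic concrete, here is a rough userspace sketch of the
limit. The constants mirror IO_TLB_SHIFT and IO_TLB_SEGSIZE from
include/linux/swiotlb.h; the helper names (max_mapping_bytes,
slots_needed, fits_in_segment) are made up for illustration and are not
kernel functions. The 2048-slot value mentioned afterwards is only an
example, not what the patch mandates.

```c
#include <stdbool.h>

/* Local copies of the swiotlb constants, for illustration only. */
#define IO_TLB_SHIFT   11                     /* one slot = 1 << 11 = 2048 bytes */
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)
#define IO_TLB_SEGSIZE 128UL                  /* max contiguous slots per mapping */

/* Hypothetical helper: largest single mapping, in bytes, for a segment size. */
static unsigned long max_mapping_bytes(unsigned long segsize)
{
	return segsize * IO_TLB_SIZE;
}

/* Hypothetical helper: number of slots needed to bounce `bytes`, rounded up. */
static unsigned long slots_needed(unsigned long bytes)
{
	return (bytes + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

/* Hypothetical helper: can one mapping of `bytes` fit in a single segment? */
static bool fits_in_segment(unsigned long bytes, unsigned long segsize)
{
	return slots_needed(bytes) <= segsize;
}
```

With the default segment size, max_mapping_bytes(IO_TLB_SEGSIZE) is
262144 bytes (256 KB), while the 3686400-byte request from the log below
needs 1800 contiguous slots, so fits_in_segment(3686400, IO_TLB_SEGSIZE)
is false. With a segment size of, say, 2048 slots it would fit.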

Thanks,
Roman

Fri, 17 Sep 2021 at 12:44, Robin Murphy <robin.murphy@xxxxxxx>:
>
> On 2021-09-17 10:36, Roman Skakun wrote:
> > Hi, Christoph
> >
> > I use Xen PV display. In my case, the PV display backend (Dom0) allocates
> > a contiguous buffer via the DMA API
> > to implement zero-copy between Dom0 and DomU.
>
> Well, something's gone badly wrong there - if you have to shadow the
> entire thing in a bounce buffer to import it then it's hardly zero-copy,
> is it? If you want to do buffer sharing the buffer really needs to be
> allocated appropriately to begin with, such that all relevant devices
> can access it directly. That might be something which needs fixing in Xen.
>
> Robin.
>
> > When I start Weston under DomU, I get the following log in Dom0:
> > ```
> > [ 112.554471] CPU: 0 PID: 367 Comm: weston Tainted: G O
> > 5.10.0-yocto-standard+ #312
> > [ 112.575149] Call trace:
> > [ 112.577666] dump_backtrace+0x0/0x1b0
> > [ 112.581373] show_stack+0x18/0x70
> > [ 112.584746] dump_stack+0xd0/0x12c
> > [ 112.588200] swiotlb_tbl_map_single+0x234/0x360
> > [ 112.592781] xen_swiotlb_map_page+0xe4/0x4c0
> > [ 112.597095] xen_swiotlb_map_sg+0x84/0x12c
> > [ 112.601249] dma_map_sg_attrs+0x54/0x60
> > [ 112.605138] vsp1_du_map_sg+0x30/0x60
> > [ 112.608851] rcar_du_vsp_map_fb+0x134/0x170
> > [ 112.613082] rcar_du_vsp_plane_prepare_fb+0x44/0x64
> > [ 112.618007] drm_atomic_helper_prepare_planes+0xac/0x160
> > [ 112.623362] drm_atomic_helper_commit+0x88/0x390
> > [ 112.628029] drm_atomic_nonblocking_commit+0x4c/0x60
> > [ 112.633043] drm_mode_atomic_ioctl+0x9a8/0xb0c
> > [ 112.637532] drm_ioctl_kernel+0xc4/0x11c
> > [ 112.641506] drm_ioctl+0x21c/0x460
> > [ 112.644967] __arm64_sys_ioctl+0xa8/0xf0
> > [ 112.648939] el0_svc_common.constprop.0+0x78/0x1a0
> > [ 112.653775] do_el0_svc+0x24/0x90
> > [ 112.657148] el0_svc+0x14/0x20
> > [ 112.660254] el0_sync_handler+0x1a4/0x1b0
> > [ 112.664315] el0_sync+0x174/0x180
> > [ 112.668145] rcar-fcp fea2f000.fcp: swiotlb buffer is full (sz:
> > 3686400 bytes), total 65536 (slots), used 112 (slots)
> > ```
> > The problem happens here:
> > https://elixir.bootlin.com/linux/v5.14.4/source/drivers/gpu/drm/rcar-du/rcar_du_vsp.c#L202
> >
> > The sgt was created in dma_get_sgtable() by dma_common_get_sgtable() and
> > contains a single chunk covering the whole buffer,
> > as shown here:
> > https://elixir.bootlin.com/linux/v5.14.5/source/kernel/dma/ops_helpers.c#L18
> >
> > After creating the new sgt, we try to map it through vsp1_du_map_sg().
> > Internally, vsp1_du_map_sg() uses ops->map_sg (e.g.
> > xen_swiotlb_map_sg) to perform
> > the mapping.
> >
> > I realized that the required segment is too big to fit into the default
> > swiotlb segment, and the condition at
> > https://elixir.bootlin.com/linux/latest/source/kernel/dma/swiotlb.c#L474
> > is always false.
> >
> > I know that I use a large buffer, but why can't I map this buffer in one chunk?
> >
> > Thanks!
> >
> > Wed, 15 Sep 2021 at 16:53, Christoph Hellwig <hch@xxxxxx>:
> >>
> >> On Wed, Sep 15, 2021 at 03:49:52PM +0200, Jan Beulich wrote:
> >>> But the question remains: Why does the framebuffer need to be mapped
> >>> in a single giant chunk?
> >>
> >> More importantly: if you use dynamic dma mappings for your framebuffer
> >> you're doing something wrong.
> >
> >
> >



--
Best Regards, Roman.