Re: [PATCH 1/2] swiotlb: Remove alloc_size argument to swiotlb_tbl_map_single()

From: Petr Tesařík
Date: Mon Apr 15 2024 - 09:27:07 EST


On Mon, 15 Apr 2024 13:03:30 +0000
Michael Kelley <mhklinux@xxxxxxxxxxx> wrote:

> From: Petr Tesařík <petr@xxxxxxxxxxx> Sent: Monday, April 15, 2024 5:50 AM
> >
> > On Mon, 15 Apr 2024 12:23:22 +0000
> > Michael Kelley <mhklinux@xxxxxxxxxxx> wrote:
> >
> > > From: Petr Tesařík <petr@xxxxxxxxxxx> Sent: Monday, April 15, 2024 4:46 AM
> > > >
> > > > Hi Michael,
> > > >
> > > > sorry for taking so long to answer. Yes, there was no agreement on the
> > > > removal of the "dir" parameter, but I'm not sure it's because of
> > > > symmetry with swiotlb_sync_*(), because the topic was not really
> > > > discussed.
> > > >
> > > > The discussion was about the KUnit test suite and whether direction is
> > > > a property of the bounce buffer or of each sync operation. Since the DMA
> > > > API associates each DMA buffer with a direction, the direction
> > > > parameter passed to swiotlb_sync_*() should match what was passed to
> > > > swiotlb_tbl_map_single(), because that's how it is used by the generic
> > > > DMA code. In other words, if the parameter is kept, it should be kept
> > > > to match dma_map_*().
> > > >
> > > > However, there is also symmetry with swiotlb_tbl_unmap_single(). This
> > > > function does use the parameter for the final sync. I believe there
> > > > should be a matching initial sync in swiotlb_tbl_map_single(). In
> > > > short, the buffer sync for DMA non-coherent devices should be moved from
> > > > swiotlb_map() to swiotlb_tbl_map_single(). If this sync is not needed,
> > > > then the caller can (and should) include DMA_ATTR_SKIP_CPU_SYNC in
> > > > the flags parameter.
> > > >
> > > > To sum it up:
> > > >
> > > > * Do *NOT* remove the "dir" parameter.
> > > > * Let me send a patch which moves the initial buffer sync.
> > > >
> > >
> > > I'm not seeing the need to move the initial buffer sync. All
> > > callers of swiotlb_tbl_map_single() already have a subsequent
> > > check for a non-coherent device, and a call to
> > > arch_sync_dma_for_device(). And the Xen code has some
> > > special handling that probably shouldn't go in
> > > swiotlb_tbl_map_single(). Or am I missing something?
> >
> > Oh, sure, there's nothing broken ATM. It's merely a cleanup. The API is
> > asymmetric and thus confusing. You get a final sync by default if you
> > call swiotlb_tbl_unmap_single(),
>
> I don't see that final sync in swiotlb_tbl_unmap_single(). It calls
> swiotlb_bounce() to copy the data, but it doesn't deal with
> non-coherent devices or call arch_sync_dma_for_cpu().

Ouch. You're right! The buffer only gets bounced, not synced, if
device DMA is non-coherent. So, how is this supposed to work?

Now I'm looking at the code in dma_direct_map_page(), and it calls
arch_sync_dma_for_device() explicitly, _except_ when using SWIOTLB. So,
maybe I should instead review all callers of swiotlb_map(), make sure
that they handle non-coherent devices, and then remove the sync from
swiotlb_map()?
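
To illustrate the asymmetry described above, here is a simplified
user-space model of the control flow. It is a sketch, not the kernel
code: every model_* and MODEL_* name is hypothetical, the bounce copy
is not modelled, and the "needs_bounce" flag stands in for whatever
decision sends a mapping through SWIOTLB.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC; the value is arbitrary. */
#define MODEL_SKIP_CPU_SYNC 0x1UL

/* Which layer issued the sync-for-device, if any. */
enum model_sync_site {
	MODEL_SYNC_NONE,
	MODEL_SYNC_IN_DIRECT_MAP,	/* models dma_direct_map_page() */
	MODEL_SYNC_IN_SWIOTLB_MAP,	/* models swiotlb_map() */
};

/* Hypothetical device description; not struct device. */
struct model_dev {
	bool coherent;		/* models dev_is_dma_coherent() */
	bool needs_bounce;	/* assumed: mapping must go through SWIOTLB */
};

/* The condition both layers test before syncing for the device. */
static bool model_wants_sync(const struct model_dev *dev, unsigned long attrs)
{
	return !dev->coherent && !(attrs & MODEL_SKIP_CPU_SYNC);
}

/* Models swiotlb_map(): the sync happens here, after the bounce. */
static enum model_sync_site
model_swiotlb_map(const struct model_dev *dev, unsigned long attrs)
{
	/* The bounce itself (swiotlb_tbl_map_single()) is not modelled. */
	return model_wants_sync(dev, attrs) ? MODEL_SYNC_IN_SWIOTLB_MAP
					    : MODEL_SYNC_NONE;
}

/* Models dma_direct_map_page(): syncs explicitly, except on the SWIOTLB path. */
static enum model_sync_site
model_direct_map_page(const struct model_dev *dev, unsigned long attrs)
{
	if (dev->needs_bounce)
		return model_swiotlb_map(dev, attrs);

	return model_wants_sync(dev, attrs) ? MODEL_SYNC_IN_DIRECT_MAP
					    : MODEL_SYNC_NONE;
}

/* Same device, same attrs: the sync comes from a different layer. */
static void model_self_check(void)
{
	const struct model_dev direct = { .coherent = false, .needs_bounce = false };
	const struct model_dev bounced = { .coherent = false, .needs_bounce = true };

	assert(model_direct_map_page(&direct, 0) == MODEL_SYNC_IN_DIRECT_MAP);
	assert(model_direct_map_page(&bounced, 0) == MODEL_SYNC_IN_SWIOTLB_MAP);
}
```

In this model the non-coherent sync is issued by the caller on the
direct path but by the callee on the bounce path, which is the
inconsistency in question. Removing the sync from the swiotlb_map()
model would require every caller to perform it instead.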

I mean, the current situation seems somewhat disorganized to me.

Petr T