Re: [PATCH 1/1] swiotlb: Fix swiotlb_bounce() to do partial sync's correctly

From: Dominique Martinet
Date: Mon Apr 01 2024 - 03:56:27 EST


Michael Kelley wrote on Sat, Mar 30, 2024 at 04:16:30AM +0000:
> From: Dominique Martinet <dominique.martinet@xxxxxxxxxxxxxxxxx> Sent: Friday, March 29, 2024 7:56 PM
> > There are two things I don't understand here:
> > 1/ Why orig_addr would come from slot[1] ?
> >
> > We have index = (tlb_addr - mem->start) >> IO_TLB_SHIFT,
> > so index = (33 - 7) >> 5 = 26 >> 5 = 0
> >
> > As such, orig_addr = mem->slots[0].orig_addr and we'd need the offset to
> > be 30, not -2 ?
>
> mem->start is the physical address of the global pool of
> memory allocated for swiotlb buffers.

Argh.
Okay, that clears up a misunderstanding I've had since day one... I
should have done a little more reading there.
I've re-checked now and indeed mem->start comes from the pool init, and
corresponds to the reserved memory base.
(I'm not actually 100% sure reserved memory has to be aligned, but I've
never seen any that isn't on my hardware, so I'll assume it is without
checking further.)

That makes much more sense with your fix: I agree the offset must be
negative in that case, and it'll work out.

> > Well, either work - if we fix index to point to the next slot in the
> > negative case that's also acceptable if we're sure it's valid, but I'm
> > worried it might not be in cases there was only one slot e.g. mapping
> > [7; 34] and calling with 33 size 2 would try to access slot 1 with a
> > negative offset in your example, but slot[0] is the last valid slot.
>
> Right, but there wouldn't be one slot mapping [7; 34] if the
> alignment rules are followed when the global swiotlb memory
> pool is originally created. The low order IO_TLB_SHIFT bits
> of slot physical addresses must be zero for the arithmetic
> using shifts to work, so [7; 34] will cross a slot boundary and
> two slots are needed.

Yes, since mem->start and the slots belong to the pool and not to the
mapping, my objection didn't make sense either; there's no problem here.

> > 2/ Why is orig_addr 37 the correct address to use for memcpy, and not
> > 33? I'd think it's off by a "minimum alignment page", for me this
> > computation only works if the dma_get_min_align size is bigger than io
> > tlb size.
>
> The swiotlb mapping operation establishes a pair-wise mapping between
> an orig_addr and tlb_addr, with the mapping extending for a specified
> number of bytes. Your example started with orig_addr = 7, and I
> posited that the mapping extends for 40 bytes.

Sure.

> I further posited that the tlb_addr returned by
> swiotlb_tbl_map_single() would be 3 to meet the min alignment
> requirement (which again only works if mem->start is 0).

Okay, that's where I'm lost.
1/ I agree that swiotlb_bounce(), when called from
swiotlb_tbl_map_single(), cannot be passed a tlb_addr past a single
segment (I'm not sure that's acceptable in itself: taking the real
value of 2KB for io tlb "pages", if the device requires 512-byte
alignment, won't [512; 2048[ be unreachable?)
2/ swiotlb_bounce() can be called from swiotlb_sync_single_for_device()
or swiotlb_sync_single_for_cpu() with no alignment check on tlb_addr;
are we just trusting that they only ever pass addresses within the same
constraint?

If we assume a tlb_addr can only range from the start of a slot to as
far as the min alignment allows, then I agree there's no problem left,
but I don't understand where that guarantee comes from.


Either way, that's not a new problem, and the old checks weren't making
this any better, so as far as I'm concerned this patch is progress:
Reviewed-by: Dominique Martinet <dominique.martinet@xxxxxxxxxxxxxxxxx>


Thanks for taking me by the hand here; if you want to keep discussing
this, I'll be happy to give it a bit more time, but I might be a bit
slower to respond.
--
Dominique