Re: [PATCH] swiotlb: fix the check whether a device has used software IO TLB

From: Petr Tesařík
Date: Fri Sep 15 2023 - 05:13:54 EST


On Thu, 14 Sep 2023 19:28:01 +0100
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:

> On Wed, Sep 13, 2023 at 02:26:56PM +0200, Petr Tesařík wrote:
> > On Wed, 13 Sep 2023 14:14:03 +0200
> > Christoph Hellwig <hch@xxxxxx> wrote:
> > > Thanks, I've applied this to get it into linux-next ASAP. I'd love
> > > to have reviews before sending it on to Linus, though.
> >
> > Fully understood, given my past record. ;-)
> >
> > @Catalin Marinas: I have added you to the list of recipient, because you
> > spotted some issues with memory barriers in one of my previous attempts.
>
> I seem to have forgotten everything about that thread (trying to
> remember what I meant in
> https://lore.kernel.org/all/ZGJdtmP13pv06xDH@xxxxxxx/ ;)).

Oh... Then I'm very much indebted to you for looking at this patch
nevertheless.

What happened is this: I originally wrote code which maintained a
per-device list of actively used swiotlb pools. The idea was rejected
in an internal review at Huawei, so I reverted to a simple flag to
reduce at least the impact on devices that do not use swiotlb. There
might be some left-overs...

> What do the smp_wmb() barriers in swiotlb_find_slots() and
> swiotlb_dyn_alloc() order? The latter is even more unclear as it's at
> the end of the function and the "pairing" comment doesn't help.

By the time swiotlb_find_slots() returns a valid slot index, the new
value of dev->dma_uses_io_tlb must be visible by all CPUs in
is_swiotlb_buffer(). The index is used to calculate the bounce buffer
address returned to device drivers. This address may be passed to
another CPU and used as an argument to is_swiotlb_buffer().

I am not sure that the smp_wmb() in swiotlb_dyn_alloc() is needed, but
let me explain my thinking. Even if the dev->dma_uses_io_tlb flag is
set, is_swiotlb_buffer() returns false unless the buffer address is
found in the RCU list of swiotlb pools, which is walked without taking
any lock. In some iterations of the patch series, there was a direct
call to swiotlb_dyn_alloc(), and a smp_wmb() was necessary to make sure
that the list of swiotlb pools cannot be observed as empty by another
CPU. You have already explained to me that taking a spin lock in
add_mem_pool() is not sufficient, because it does not invalidate a
value that might have been speculatively read by another CPU before
entering the critical section. OTOH swiotlb_dyn_alloc() is always
called through schedule_work() in the current version. If that
implicitly provides necessary barriers, this smp_wmb() can be removed.

FTR list_add_rcu() alone is not sufficient. It adds the new element
using rcu_assign_pointer(), which modifies the forward link with
smp_store_release(). However, list_for_each_entry_rcu() reads the head
with list_entry_rcu(), which is a simple READ_ONCE(); there is no
ACQUIRE semantics.

Actually, I wonder when a newly added RCU list element is guaranteed to
be visible by all CPUs. Existing documentation deals mainly with
element removal, explaining that both the old state and the new state
of an RCU list are valid after addition. Is it assumed that the old
state of an RCU list after addition is valid indefinitely?

> I found the "pairing" description in memory-barriers.txt not helping
> much and lots of code comments end up stating that some barrier is
> paired with another as if it is some magic synchronisation event. In
> general a wmb() barrier ends up paired with an rmb() one (or some other
> form of dependency, acquire semantics etc.) but they should be described
> in isolation first - what memory accesses on a CPU before and after the
> *mb() need ordering - and then how this pairing solves the accesses
> observability across multiple CPUs. The comments should also describe
> things like "ensure payload is visible before flag update" rather than
> "the wmb here pairs with the rmb over there".

Good point. I will improve my memory barrier comments accordingly.

> Presumably in swiotlb_find_slots(), the smp_wmb() needs to ensure the
> write to dev->dma_uses_io_tlb is visible before the *retpool write? For
> swiotlb_dyn_alloc(), I'm completely lost on what the smp_wmb() orders.
> You really need at least two memory accesses on a CPU for the barrier to
> make sense.

Thank you again for your time!

Petr T