Re: [PATCH v3 1/1] swiotlb: Reduce swiotlb pool lookups
From: Petr Tesařík
Date: Tue Jul 09 2024 - 15:08:32 EST
On Tue, 9 Jul 2024 13:18:12 +0200
Christoph Hellwig <hch@xxxxxx> wrote:
> On Tue, Jul 09, 2024 at 11:10:13AM +0200, Petr Tesařík wrote:
> > Reviewed-by: Petr Tesarik <petr@xxxxxxxxxxx>
>
> Thanks.
>
> >
> > OK, so __swiotlb_find_pool() is now always declared (so the code
> > compiles), but if CONFIG_SWIOTLB_DYNAMIC=n, it is never defined. The
> > result still links, because the compiler optimizes away the whole
> > if-clause, so there are no references to an undefined symbol in the
> > object file.
> >
> > I think I've already seen similar constructs elsewhere in the kernel,
> > so relying on the optimization seems to be common practice.
>
> Yes, it's a pretty common patter. It's gone here now, though to not
> add the struct device field unconditionally.
>
> > > +{
> > > + struct io_tlb_pool *pool = swiotlb_find_pool(dev, addr);
> > > +
> > > + if (unlikely(pool))
> > > + __swiotlb_tbl_unmap_single(dev, addr, size, dir, attrs, pool);
> > > +}
> > > +
> > > +static inline void swiotlb_sync_single_for_device(struct device *dev,
> > > + phys_addr_t addr, size_t size, enum dma_data_direction dir)
> > > +{
> > > + struct io_tlb_pool *pool = swiotlb_find_pool(dev, addr);
> > > +
> > > + if (unlikely(pool))
> > > + __swiotlb_sync_single_for_device(dev, addr, size, dir, pool);
> >
> > We're adding an unlikely() here, which wasn't originally present in
> > iommu_dma_sync_single_for_device(). OTOH it should do no harm, and it
> > was most likely an omission.
>
> I'm honestly not a big fan of the unlikely annotations unlike they
> are proven to make a difference. Normally the runtime branch predictor
> should do a really good job here, and for some uses this will not
> just be likely but the only case.
I'm confused. If you're not a big fan, why are we effectively adding
them to more places now than before the patch?
FWIW on all architectures I've worked with (arm64, ia64, powerpc,
s390x, x86_64), the dynamic branch predictor takes precedence over
static prediction (yes, even ia64 held on to those mandatory .spnt.few
and .sptk.many hints only if there was no entry in the branch
predictor). But on every branch instruction, the compiler must decide
which code follows the branch instruction and which code is placed at
another place. The latter often needs another jump instruction to
continue executing instructions _after_ the whole conditional clause,
which is slightly less efficient. The compiler must make this decision
even if there is no likely() or unlikely() annotation, but in that case
it uses some (rather crude) heuristics of its own, e.g. whether the
condition covers the majority or minority of the value range.
Long story short, in this case "if (pool)" does pretty much the same
job as "if (likely(pool))". But we say "if (unlikely(pool)))"...
You needn't change the code, but I want this documented somewhere that
people can find it, i.e. in LKML archives.
Petr T