Re: [PATCH] iommu/arm-smmu: Don't disable next-page prefetcher on devices it works on

From: Doug Anderson
Date: Fri May 17 2024 - 13:19:59 EST


Hi,

On Fri, May 17, 2024 at 9:37 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> Hi Doug,
>
> On Mon, May 13, 2024 at 04:13:47PM -0700, Douglas Anderson wrote:
> > On sc7180 trogdor devices we get a scary warning at bootup:
> > arm-smmu 15000000.iommu:
> > Failed to disable prefetcher [errata #841119 and #826419], check ACR.CACHE_LOCK
> >
> > We spent some time trying to figure out how we were going to fix these
> > errata and whether we needed to do a firmware update. Upon closer
> > inspection, however, we realized that the errata don't apply to us.
> > Specifically, the errata document says that for these errata:
> > * Found in: r0p0
> > * Fixed in: r2p2
> >
> > ...and trogdor devices appear to be running r2p4. That means that they
> > are unaffected despite the scary warning.
> >
> > The issue is that the kernel unconditionally tries to disable the
> > prefetcher even on unaffected devices and then warns when it's unable
> > to.
> >
> > Let's change the kernel to only disable the prefetcher on affected
> > devices, which will get rid of the scary warning on devices that are
> > unaffected. As per the comment the prefetcher is
> > "not-particularly-beneficial" but it shouldn't hurt to leave it on for
> > devices where it doesn't cause problems.
> >
> > Fixes: f87f6e5b4539 ("iommu/arm-smmu: Warn once when the perfetcher errata patch fails to apply")
> > Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> > ---
> >
> > drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 21 +++++++++++++--------
> > 1 file changed, 13 insertions(+), 8 deletions(-)
>
>
> Just curious, but did you see any performance impact (good or bad) as a
> result of this patch? The next-page prefetcher has always looked a little
> naive to me and, with a tendency for tiny TLBs in some implementations,
> there's a possibility it could do more harm than good.

This patch actually makes no difference on trogdor today other than
getting rid of the scary warning. Specifically on trogdor the
ACR.CACHE_LOCK bit seems to be set so the kernel is unable to change
the setting anyway and has never been able to. We are working on
figuring out how to fix the firmware and then we have to get a
firmware spin before we can really see any changes. I'll keep an eye
out to see if performance numbers change when the firmware uprevs.

BTW: any idea how big of a deal these errata are? We're _just_
finishing a firmware uprev process and there is always pushback
against kicking off a new one unless the issue is important. Given
that we've been living with this issue since devices shipped I'm going
to assume we don't need to rush a firmware update, but if this is
really scary and needs to be addressed sooner we can figure that out.

-Doug