Re: [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts

From: Jason Gunthorpe

Date: Fri Mar 06 2026 - 08:07:19 EST


On Thu, Mar 05, 2026 at 09:06:17PM -0800, Nicolin Chen wrote:
> On Thu, Mar 05, 2026 at 09:33:47PM -0400, Jason Gunthorpe wrote:
> > On Thu, Mar 05, 2026 at 05:29:22PM -0800, Nicolin Chen wrote:
> >
> > > But arm_smmu_cmdq_issue_cmdlist() doesn't know when to push another
> > > CMD. In my case where ATC_INV irq occurs, the return value from the
> > > arm_smmu_cmdq_poll_until_sync() in the Step 5 is 0, and prods/cons
> > > are also matched. Actually, at this point that NOP ISR has already
> > > finished.
> >
> > Yes, you'd need a sneaky way to convey the error from the ISR to the
> > cmdlist code that didn't harm performance. Maybe we could come up with
> > something, but if it works replacing the NOP with flush sounds fairly
> > appealing - though can you do a single WORD edit to the STE that will
> > block translated requests? Zero EATS?
>
> Yea. I can give that a try.

This also really needs to go after the invalidation changes, because
with those it is feasible to also edit the lockless RCU invalidation
list from the ISR and disable the ATC for the failed device too.
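
For illustration only, the single-word STE edit discussed above could
look roughly like the userspace sketch below. It is not driver code:
struct ste and ste_block_translated() are hypothetical names, the
volatile store only stands in for WRITE_ONCE(), and the EATS position
(STE bits [93:92], so bits [29:28] of 64-bit word 1) is my reading of
the STE layout. A real change would still have to issue CFGI_STE plus
a sync afterwards so the SMMU refetches the entry.

```c
#include <assert.h>
#include <stdint.h>

#define STE_DWORDS		8
/* Assumed layout: EATS lives in bits [29:28] of STE word 1. */
#define STE_1_EATS_SHIFT	28
#define STE_1_EATS_MASK		(UINT64_C(3) << STE_1_EATS_SHIFT)

/* Hypothetical stand-in for a stream table entry. */
struct ste {
	uint64_t data[STE_DWORDS];
};

/*
 * Zero EATS with exactly one 64-bit store to word 1, leaving every
 * other field in that word untouched. EATS == 0 means translated
 * requests from the device are aborted. Returns the old word so the
 * caller could restore it later.
 */
static inline uint64_t ste_block_translated(struct ste *ste)
{
	uint64_t old, val;

	assert(ste);
	old = ste->data[1];
	val = old & ~STE_1_EATS_MASK;
	/* Single store, so the hardware never sees a torn update. */
	*(volatile uint64_t *)&ste->data[1] = val;
	return old;
}
```

Because only one word changes, there is no intermediate state to
sequence, which is what makes it plausible to do from the ISR.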

> > Also, will the SMMU start spamming with blocked translation events or
> > something that will need suppression too?
>
> CD.R=0 can suppress fault records, but we would need to override
> that in every CD of the device.

That's too much to do from the ISR, but maybe we can do it from a WQ.
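
To show why this is more work than the STE edit, here is a userspace
sketch of the walk such a work item would have to do. Everything in it
is an assumption for illustration: struct cd and
cd_table_suppress_faults() are made-up names, and the bit positions
(V at bit 31 and R at bit 45 of CD word 0) are my reading of the CD
layout. The real thing would also need a CFGI_CD per modified entry
and a sync, which this does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions in CD word 0. */
#define CD_0_V	(UINT64_C(1) << 31)	/* entry is valid */
#define CD_0_R	(UINT64_C(1) << 45)	/* record faults */

/* Hypothetical stand-in for a context descriptor. */
struct cd {
	uint64_t data[8];
};

/*
 * Clear R in every valid CD of one device's table so faults are no
 * longer recorded. Returns how many entries were changed. The cost
 * grows with the table size, hence a work item rather than the ISR.
 */
static size_t cd_table_suppress_faults(struct cd *cds, size_t num_cds)
{
	size_t i, changed = 0;

	assert(cds || !num_cds);
	for (i = 0; i < num_cds; i++) {
		uint64_t w = cds[i].data[0];

		/* Skip unused entries and ones that are already quiet. */
		if (!(w & CD_0_V) || !(w & CD_0_R))
			continue;
		*(volatile uint64_t *)&cds[i].data[0] = w & ~CD_0_R;
		changed++;
	}
	return changed;
}
```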

Jason