Re: [PATCH v1 2/2] iommu/arm-smmu-v3: Recover ATC invalidate timeouts

From: Nicolin Chen

Date: Thu Mar 05 2026 - 20:29:55 EST


On Thu, Mar 05, 2026 at 07:41:58PM -0400, Jason Gunthorpe wrote:
> On Thu, Mar 05, 2026 at 01:15:45PM -0800, Nicolin Chen wrote:
>
> > You mean in arm_smmu_cmdq_issue_cmdlist() that issued the timed
> > out ATC command?
>
> Yes, it was my off hand thought.
>
> > So my test case was to trigger a device fault followed by an ATC
> > command. But, I found that the ATC command submission returned 0
> > while only the ISR received:
> > CMDQ error (cons 0x03000003): ATC invalidate timeout
> > arm_smmu_debugfs_atc_write: ATC_INV ret=0
> >
> > It seems difficult to insert a CMDQ_OP_CFGI_STE in the submission
> > thread?
>
> I didn't look, but I thought the CMDQ stops on the ATC invalidation,
> flags the error and the ISR NOP's the failing CMDQ entry and restarts
> it to resume the thread? Is that something else?
>
> If so you could insert the STE flush instead of a NOP

Yeah, we could do a surgical STE update/flush in the ISR, bypassing
the arm_smmu_ste_writer, which has a dependency on "master" vs "smmu".

> Otherwise the arm_smmu_cmdq_issue_cmdlist() can just push another CMD
> to the queue and sync, it is obviously in a context that can do that.

It is actually a good idea and would make things cleaner.

But arm_smmu_cmdq_issue_cmdlist() doesn't know when to push another
CMD. In my case, where the ATC_INV IRQ occurs, the return value from
arm_smmu_cmdq_poll_until_sync() in Step 5 is 0, and prod/cons also
match. In fact, by that point the ISR that NOPs the failing entry has
already finished.
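
To illustrate, here is a toy user-space model of the sequence as it is
described in this thread. It is only a sketch, not driver code: every
toy_* name is hypothetical, and the real ISR may handle this error
differently. It assumes the queue stalls on the failing ATC_INV and the
ISR rewrites that entry as a NOP before the queue resumes. Under that
assumption the submitter sees a successful sync with prod == cons, and
only the ISR knows that a timeout happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in the arm-smmu-v3 driver. */
enum toy_op { TOY_OP_NOP, TOY_OP_ATC_INV, TOY_OP_SYNC };

#define TOY_Q_LEN 8

struct toy_cmdq {
	enum toy_op ent[TOY_Q_LEN];
	uint32_t prod;
	uint32_t cons;
	bool atc_dead;			/* device never answers ATC_INV */
	unsigned int isr_timeouts;	/* recorded by the "ISR" only */
};

static void toy_push(struct toy_cmdq *q, enum toy_op op)
{
	q->ent[q->prod++ % TOY_Q_LEN] = op;
}

/* "Hardware": consume entries; return false if stalled on an ATC_INV. */
static bool toy_hw_run(struct toy_cmdq *q)
{
	while (q->cons != q->prod) {
		enum toy_op op = q->ent[q->cons % TOY_Q_LEN];

		if (op == TOY_OP_ATC_INV && q->atc_dead)
			return false;
		q->cons++;
	}
	return true;
}

/* "ISR": rewrite the failing entry as a NOP so the queue can resume. */
static void toy_isr(struct toy_cmdq *q)
{
	q->ent[q->cons % TOY_Q_LEN] = TOY_OP_NOP;
	q->isr_timeouts++;
}

/* Submitter: push ATC_INV + SYNC and wait; returns 0 once sync is seen. */
static int toy_issue_atc_inv(struct toy_cmdq *q)
{
	toy_push(q, TOY_OP_ATC_INV);
	toy_push(q, TOY_OP_SYNC);

	/* The "ISR" runs behind the submitter's back while it polls. */
	while (!toy_hw_run(q))
		toy_isr(q);

	return q->cons == q->prod ? 0 : -1;
}
```

With atc_dead set, toy_issue_atc_inv() still returns 0, so nothing in
the return value or the queue indices tells the submitter to push a
follow-up command.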

Thanks
Nicolin