Re: [PATCH] iommu/riscv: Replace illegal command with dummy IOFENCE to prevent hardware lockup

From: Zong Li

Date: Sun Jun 28 2026 - 21:59:45 EST

On Fri, Jun 26, 2026 at 9:50 PM <fangyu.yu@xxxxxxxxxxxxxxxxx> wrote:
>
> >On Wed, Jun 24, 2026 at 10:08 PM <fangyu.yu@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> >When the RISC-V IOMMU encounters an illegal command, the hardware
> >> >stops processing and the HEAD register remains pointing at the
> >> >illegal command. If software does not handle this properly, the
> >> >hardware will be stuck at this index indefinitely, preventing any
> >> >further command queue operations.
> >> >
> >> >This patch implements a recovery mechanism by replacing the illegal
> >> >command with a dummy IOFENCE instruction (all operands are zero):
> >> >
> >> >1. Prevents hardware lockup: By overwriting the illegal command with
> >> > a valid instruction, the hardware can continue processing from the
> >> > current position instead of being stuck.
> >> >
> >> >2. Enables user recovery: After replacing the illegal command, the
> >> > user/driver has an opportunity to retry the original failed
> >> > operation rather than losing all queued work.
> >> >
> >> >3. Minimal hardware impact: A dummy IOFENCE behaves as a NOP, it
> >> > it performs no cache invalidation operations and has no side
> >> > effects on the system state. This is the safest replacement
> >> > instruction.
> >> >
> >> >Signed-off-by: Zong Li <zong.li@xxxxxxxxxx>
> >> >---
> >> > drivers/iommu/riscv/iommu.c | 24 +++++++++++++++++++++++-
> >> > 1 file changed, 23 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/drivers/iommu/riscv/iommu.c b/drivers/iommu/riscv/iommu.c
> >> >index cec3ddd7ab10..6305ec5f467b 100644
> >> >--- a/drivers/iommu/riscv/iommu.c
> >> >+++ b/drivers/iommu/riscv/iommu.c
> >> >@@ -464,13 +464,35 @@ static unsigned int riscv_iommu_queue_send(struct riscv_iommu_queue *queue,
> >> > static irqreturn_t riscv_iommu_cmdq_process(int irq, void *data)
> >> > {
> >> > const struct riscv_iommu_queue *queue = (struct riscv_iommu_queue *)data;
> >> >- unsigned int ctrl;
> >> >+ struct riscv_iommu_command cmd;
> >> >+ unsigned int ctrl, head;
> >> >
> >> > /* Clear MF/CQ errors, complete error recovery to be implemented. */
> >> > ctrl = riscv_iommu_readl(queue->iommu, queue->qcr);
> >> > if (ctrl & (RISCV_IOMMU_CQCSR_CQMF | RISCV_IOMMU_CQCSR_CMD_TO |
> >> > RISCV_IOMMU_CQCSR_CMD_ILL | RISCV_IOMMU_CQCSR_FENCE_W_IP)) {
> >> >+ /*
> >> >+ * The head pointer is not updated by the hardware, it
> >> >+ * still points to the index of illegal command
> >> >+ */
> >> >+ riscv_iommu_readl_timeout(queue->iommu, Q_HEAD(queue), head,
> >> >+ !(head & ~queue->mask), 0,
> >> >+ RISCV_IOMMU_QUEUE_TIMEOUT);
> >> >+
> >> >+ if (ctrl & RISCV_IOMMU_CQCSR_CMD_ILL) {
> >> >+ /*
> >> >+ * Use a dummy IOFENCE instead of the illegal command
> >> >+ * to prevent hardware lockup
> >> >+ */
> >>
> >> The RISC-V IOMMU spec 1.0 (Section 5.15, cqcsr) states:
> >>
> >> "If software makes the CQ operational again after a cmd_ill or
> >> cqmf error, then software should resubmit the commands submitted
> >> since the last IOFENCE.C that successfully completed."
> >>
> >> So it seems that simply replacing the illegal command and letting the
> >> queue continue is not sufficient.
> >>
> >
> >Yes, this patch doesn't completly follow what the spec mentioned.
> >Initially, I wanted to implement the method suggested by the spec.
> >However, there are some synchronization challenges based on current
> >driver implementation. Therefore, I chose an approach similar to other
> >IOMMU drivers (like SMMU and Intel VT-d). The main goal is to at least
> >prevent the hardware from getting stuck and crashing the entire
> >system.
>
> Thanks for the detailed explanation. I understand the practical
> challenges you described. However, I’d like to note that the ARM
> SMMUv3 and Intel VT-d specs appear to provide somewhat different
> guarantees from the RISC-V IOMMU spec with regard to error recovery.
>
> ARM SMMUv3 (IHI 0070, Section 7.1):
> "Older commands are unaffected by a later Command queue error."
>
> Intel VT-d (Section 6.5.2.10):
> "Any invalidation commands ahead of the invalid descriptor that
> are already fetched and pending in hardware at the time of
> detecting the invalid descriptor error are completed by
> hardware as normal."
>

Thanks for your elaboration, it is important for discussion here.
Although SMMU and Intel VT-d can protect the commands sent before an
illegal one, the original task of the illegal command is still lost.
This might still lead to unexpected hardware behavior

> >
> >Here are the main issues we would need to solve if we fully follow the spec:
> >
> >1. IOFENCE index shift: After an illegal command, if another thread is
> >waiting for subsequent IOFENCE to finish, re-submitting commands will
> >change that IOFENCE's index in the queue. This means the waiting
> >thread in 'riscv_iommu_cmd_sync' might finish too early because the
> >'prod' value should be changed as well. We could fix this by making
> >IOFENCE write a sequence number to a specific address, and having the
> >thread wait for that data instead.
> >
> >2. Timeout errors: If an illegal command happens while another thread
> >is trying to write a command, that thread might be waiting for the
> >tail to move (in 'riscv_iommu_queue_send'), and exit the wait due to a
> >timeout. This leads to errors in the caller subsystem (like DMA). So
> >it seems even if the resubmit finishes later, it might not help much.
> >
> >3. Queue tail mismatch: Similar to point 2 situation, a thread waiting
> >in 'riscv_iommu_queue_send' expects prod == queue->tail. If we
> >re-submit commands quickly, queue->tail is updated asynchronously to a
> >farther value. The waiting thread might never see the condition met
> >and time out.
> >
> >4. Shadow queue overhead: Inside the threading IRQ handler, we cannot
> >easily know what the illegal command was just by checking the current
> >command queue. We would need to create a "shadow command queue" to
> >keep a history. This would break the current driver design. We would
> >also need to add locks to prevent race conditions on shadow command
> >queue, which would reduce the driver's performance.
> >
> >Considering these trade-offs, I prefer not to make the driver much
> >more complex and slower just to handle rare hardware errors.
> >Personally, I lean towards treating this hardware fault as a fatal
> >error without trying to recover it in software. However, I understand
> >this might be too extreme and would require a hardware reset.
> >Therefore, this patch might be a good middle ground. It prevents the
> >hardware lockup. Even though we might lose some commands and cause
> >incorrect results for the user, it at least keeps the system alive and
> >gives the user a chance to retry their operation again.
> >
>
> I understand the trade-off. Given the complexity of full spec
> compliance, this simplified version may not actually address the
> underlying issue very well, so I’d suggest making its limitations
> explicit. Specifically, could you document in the commit message or
> code comments which scenarios it can handle and which ones may
> still remain problematic?
>

Ok, I will add more details to the code comments and the commit
message. As we discussed, this method is a trade-off and doesn't
completely fix the problem. Whether to accept this patch might still
depend on the maintainer's decision. Thank you very much for your
review and the helpful discussion

> >
> >> Thanks,
> >> Fangyu
> >>
> >> >+ memset(&cmd, 0, sizeof(cmd));
> >> >+ cmd.dword0 = FIELD_PREP(RISCV_IOMMU_CMD0_OPCODE,
> >> >+ RISCV_IOMMU_CMD_IOFENCE_OPCODE);
> >> >+ memcpy(queue->base + head * sizeof(cmd), &cmd, sizeof(cmd));
> >> >+ dma_wmb();
> >> >+ }
> >> >+
> >> > riscv_iommu_writel(queue->iommu, queue->qcr, ctrl);
> >> >+
> >> > dev_warn(queue->iommu->dev,
> >> > "Queue #%u error; fault:%d timeout:%d illegal:%d fence_w_ip:%d\n",
> >> > queue->qid,
> >> >--
> >> >2.43.7