RE: [bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios
From: Tian, Kevin
Date: Thu Jul 25 2024 - 20:04:38 EST
> From: Jason Gunthorpe <jgg@xxxxxxxx>
> Sent: Thursday, July 25, 2024 8:59 PM
>
> On Thu, Jul 25, 2024 at 07:35:00AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxx>
> > > Sent: Wednesday, July 24, 2024 9:03 PM
> > > + * and the fault remains owned by the caller. The caller should log the
> DMA
> > > + * protection failure and resolve the fault. Otherwise on success the fault
> is
> > > + * always completed eventually.
> >
> > About "resolve the fault", I didn't find such logic from smmu side in
> > arm_smmu_evtq_thread(). It just logs the event. Is it asking for new
> > change in smmu driver or reflecting the current fact which if missing
> > leads to the said stall problem?
>
> It was removed in b554e396e51c ("iommu: Make iopf_group_response()
> return void")
>
> ret = iommu_report_device_fault(master->dev, &fault_evt);
> - if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
> - /* Nobody cared, abort the access */
> - struct iommu_page_response resp = {
> - .pasid = flt->prm.pasid,
> - .grpid = flt->prm.grpid,
> - .code = IOMMU_PAGE_RESP_FAILURE,
> - };
> - arm_smmu_page_response(master->dev, &fault_evt, &resp);
> - }
> -
>
> Part of the observation going into b554e396e51c was that all drivers
> have something like the above, and we can pull it into the core code.
>
> So perhaps we should still always abort the request from
> iommu_report_device_fault() instead of requiring boilerplate like
> above in drivers. That does some better.
>
> The return code only indicates if the event should be logged.
>
Yes, this makes more sense. Otherwise we need also pull back
those removed lines in drivers for fault resolving.