RE: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list
From: Tian, Kevin
Date: Mon Jan 20 2025 - 04:23:44 EST
> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Saturday, January 18, 2025 2:49 AM
>
> On Fri, Jan 17, 2025 at 10:38:56AM -0400, Jason Gunthorpe wrote:
> > On Fri, Jan 17, 2025 at 06:20:15AM +0000, Tian, Kevin wrote:
> > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > > Sent: Friday, January 17, 2025 10:05 AM
> > > >
> > > > mutex_lock(&fault->mutex);
> > >
> > > Nit. The scope of above can be reduced too, by guarding only the
> > > lines for fault->response.
> >
> > Hmm, I think you have found a flaw unfortunately..
> >
> > iommufd_auto_response_faults() is called async to all of this if a
> > device is removed. It should clean out that device from all the fault
> > machinery.
> >
> > With the new locking we don't hold the mutex across the list
> > manipulation in read so there is a window where a fault can be on the
> > stack in iommufd_fault_fops_read() but not in the fault->response or
> > the deliver list.
> >
> > Thus it will be missed during cleanup.
> >
> > I think because of the cleanup we have to continue to hold the mutex
> > across all of fops_read and this patch is just adding an additional
> > spinlock around the deliver list to isolate it from the
> > copy_to_user().
> >
> > Is that right Nicolin?
>
> Yes. I've missed that too..
>
> A group can be read out of the deliver list in fops_read() prior
> to auto_response_faults() taking the mutex, then its following
> xa_alloc() will add to the response list that fetched group, and
> it will stay in the xarray until iommufd_fault_destroy() flushes
> everything away.
>
> It might not be a bug to the existing flow (?), but doesn't seem
> to worth touching the mutex in this patch.
>
> Let me send a v4 changing that mutex back.
>
While looking at the related logic I wonder whether there is another
potential issue in smmu side.
Based on discussion with Baolu currently iommu_detach_group_handle()
plays the role of quiescing fault reporting from iommu core. Once it's
completed there won't be any new event queued to fault->deliver then
it's safe to call iommufd_auto_response_faults() to handle those already
queued events in iommufd.
Intel-iommu driver calls intel_iommu_drain_pasid_prq() as the last step
in detach which scans the prq queue and waits for all pending requests
related to the device to be delivered (by comparing head/tail pointer).
There is a bug (just fixed by Baolu [1]) but the logic is there.
But I didn't see the similar logic in arm_smmu_attach_dev_blocked().
And arm_smmu_evtq_thread() just removes an entry from the cmd
queue and calls arm_smmu_handle_evt() to report. Seems no track of
when those events are delivered.
Detach may happen after iommu_report_device_fault() acquires old
domain/handle/handler/etc. but before iommufd_fault_iopf_handler()
actually queues an event to fault->deliver. Then UAF.
Did I overlook anything here?
Thanks
Kevin
[1] https://lore.kernel.org/all/20250120080144.810455-1-baolu.lu@xxxxxxxxxxxxxxx/