RE: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list

From: Tian, Kevin
Date: Mon Jan 20 2025 - 04:23:44 EST

Next message: Sergio Paracuellos: "[PATCH v3 5/6] mips: dts: ralink: mt7620a: update system controller node and its consumers"
Previous message: Sergio Paracuellos: "[PATCH v3 4/6] mips: dts: ralink: rt3883: update system controller node and its consumers"
In reply to: Nicolin Chen: "Re: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list"
Next in thread: Jason Gunthorpe: "Re: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Saturday, January 18, 2025 2:49 AM
>
> On Fri, Jan 17, 2025 at 10:38:56AM -0400, Jason Gunthorpe wrote:
> > On Fri, Jan 17, 2025 at 06:20:15AM +0000, Tian, Kevin wrote:
> > > > From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> > > > Sent: Friday, January 17, 2025 10:05 AM
> > > >
> > > > mutex_lock(&fault->mutex);
> > >
> > > Nit. The scope of above can be reduced too, by guarding only the
> > > lines for fault->response.
> >
> > Hmm, I think you have found a flaw unfortunately..
> >
> > iommufd_auto_response_faults() is called async to all of this if a
> > device is removed. It should clean out that device from all the fault
> > machinery.
> >
> > With the new locking we don't hold the mutex across the list
> > manipulation in read so there is a window where a fault can be on the
> > stack in iommufd_fault_fops_read() but not in the fault->response or
> > the deliver list.
> >
> > Thus it will be missed during cleanup.
> >
> > I think because of the cleanup we have to continue to hold the mutex
> > across all of fops_read and this patch is just adding an additional
> > spinlock around the deliver list to isolate it from the
> > copy_to_user().
> >
> > Is that right Nicolin?
>
> Yes. I've missed that too..
>
> A group can be read out of the deliver list in fops_read() prior
> to auto_response_faults() taking the mutex, then its following
> xa_alloc() will add to the response list that fetched group, and
> it will stay in the xarray until iommufd_fault_destroy() flushes
> everything away.
>
> It might not be a bug to the existing flow (?), but doesn't seem
> to worth touching the mutex in this patch.
>
> Let me send a v4 changing that mutex back.
>

While looking at the related logic I wonder whether there is another
potential issue in smmu side.

Based on discussion with Baolu currently iommu_detach_group_handle()
plays the role of quiescing fault reporting from iommu core. Once it's
completed there won't be any new event queued to fault->deliver then
it's safe to call iommufd_auto_response_faults() to handle those already
queued events in iommufd.

Intel-iommu driver calls intel_iommu_drain_pasid_prq() as the last step
in detach which scans the prq queue and waits for all pending requests
related to the device to be delivered (by comparing head/tail pointer).
There is a bug (just fixed by Baolu [1]) but the logic is there.

But I didn't see the similar logic in arm_smmu_attach_dev_blocked().
And arm_smmu_evtq_thread() just removes an entry from the cmd
queue and calls arm_smmu_handle_evt() to report. Seems no track of
when those events are delivered.

Detach may happen after iommu_report_device_fault() acquires old
domain/handle/handler/etc. but before iommufd_fault_iopf_handler()
actually queues an event to fault->deliver. Then UAF.

Did I overlook anything here?

Thanks
Kevin

[1] https://lore.kernel.org/all/20250120080144.810455-1-baolu.lu@xxxxxxxxxxxxxxx/

Next message: Sergio Paracuellos: "[PATCH v3 5/6] mips: dts: ralink: mt7620a: update system controller node and its consumers"
Previous message: Sergio Paracuellos: "[PATCH v3 4/6] mips: dts: ralink: rt3883: update system controller node and its consumers"
In reply to: Nicolin Chen: "Re: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list"
Next in thread: Jason Gunthorpe: "Re: [PATCH rc v3] iommufd/fault: Use a separate spinlock to protect fault->deliver list"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]