RE: [PATCH v8 06/10] iommufd: Add iommufd fault object
From: Tian, Kevin
Date: Thu Jul 04 2024 - 02:38:18 EST
> From: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Sent: Thursday, July 4, 2024 1:36 PM
>
> On Thu, Jul 04, 2024 at 10:59:45AM +0800, Baolu Lu wrote:
> > > On Tue, Jul 02, 2024 at 02:34:40PM +0800, Lu Baolu wrote:
> > >
> > > +enum iommu_fault_type {
> > > + IOMMU_FAULT_TYPE_HWPT_IOPF,
> > > + IOMMU_FAULT_TYPE_VIOMMU_IRQ,
> > > +};
> > >
> > > struct iommu_fault_alloc {
> > > __u32 size;
> > > __u32 flags;
> > > + __u32 type; /* enum iommu_fault_type */
> > > __u32 out_fault_id;
> > > __u32 out_fault_fd;
and need a new reserved field for alignment.
> > > };
> > >
> > > I understand that this is already v8. So, maybe we can, for now,
> > > apply the small diff above with an IOMMU_FAULT_TYPE_HWPT_IOPF
> type
> > > check in the ioctl handler. And a decoupling for the iopf fops in
> > > the ioctl handler can come later in the viommu series:
> > > switch (type) {
> > > case IOMMU_FAULT_TYPE_HWPT_IOPF:
> > > filep = anon_inode_getfile("[iommufd-pgfault]",
> > > &iommufd_fault_fops_iopf);
> > > case IOMMU_FAULT_TYPE_VIOMMU_IRQ:
> > > filep = anon_inode_getfile("[iommufd-viommu-irq]",
> > > &iommufd_fault_fops_viommu);
> > > default:
> > > return -EOPNOSUPP;
> > > }
> > >
> > > Since you are the designer here, I think you have a better 10000
> > > foot view -- maybe I am missing something here implying that the
> > > fault object can't be really reused by viommu.
> > >
> > > Would you mind sharing some thoughts here?
> >
> > I think this is a choice between "two different objects" vs. "same
> > object with different FD interfaces". If I understand it correctly, your
> > proposal of unrecoverable fault delivery is not limited to vcmdq, but
> > generic to all unrecoverable events that userspace should be aware of
> > when the passed-through device is affected.
>
> It's basically IRQ forwarding, not confined to unrecoverable
> faults. For example, a VCMDQ used by the guest kernel would
> raise an HW IRQ if the guest kernel issues an illegal command
> to the HW Queue assigned to it. The host kernel will receive
> the IRQ, so it needs a way to forward it to the VM for guest
> kernel to recover the HW queue.
>
> The way that we define the structure can follow what we have
> for hwpt_alloc/invalidate uAPIs, i.e. driver data/event. And
> such an event can carry unrecoverable translation faults too.
> SMMU at least reports DMA translation faults using an eventQ
> in its own native language.
>
> > From a hardware architecture perspective, the interfaces for
> > unrecoverable events don't always match the page faults. For example,
> > VT-d architecture defines a PR queue for page faults, but uses a
> > register set to report unrecoverable events. The 'reason', 'request id'
> > and 'pasid' fields of the register set indicate what happened on the
> > hardware. New unrecoverable events will not be reported until the
> > previous one has been fetched.
>
> Understood. I don't think we can share the majority pieces in
> the fault.c. Just the "IOMMU_FAULT_QUEUE_ALLOC" ioctl itself
> looks way too general to be limited to page-fault usage only.
> So, I feel we can share, for example:
> IOMMU_FAULT_QUEUE_ALLOC (type=hwpt_iopf) -> fault_id=1
> IOMMU_HWPT_ALLOC (fault_id=1) -> hwpt_id=2
> IOMMU_FAULT_QUEUE_ALLOC (type=viommu_irq) -> fault_id=3
> IOMMU_VIOMMU_ALLOC (fault_id=2) -> viommu_id=4
> The handler will direct to different fops as I drafted in my
> previous mail.
>
> > With the above being said, I have no strong opinions between these two
> > choices. Jason and Kevin should have more insights.
>
> Thanks. Jason is out of office this week, so hopefully Kevin
> may shed some light. I personally feel that we don't need to
> largely update this series until we add VIOMMU. Yet, it would
> be convenient if we add a "type" in the uAPI with this series.
>
This is ok to me.