Re: [RFC v3 14/21] iommu: introduce device fault data

From: Jean-Philippe Brucker
Date: Wed Jan 16 2019 - 11:55:00 EST


On 15/01/2019 21:27, Auger Eric wrote:
[...]
>>>> /* iommu fault flags */
>>>> -#define IOMMU_FAULT_READ 0x0
>>>> -#define IOMMU_FAULT_WRITE 0x1
>>>> +#define IOMMU_FAULT_READ (1 << 0)
>>>> +#define IOMMU_FAULT_WRITE (1 << 1)
>>>> +#define IOMMU_FAULT_EXEC (1 << 2)
>>>> +#define IOMMU_FAULT_PRIV (1 << 3)
>>>>
>>>> typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
>>>> struct device *, unsigned long, int, void *);
>>>> +typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault_event *,
>>>> void *);
>>>> struct iommu_domain_geometry {
>>>> dma_addr_t aperture_start; /* First address that can be
>>>> mapped */ @@ -255,6 +259,52 @@ struct iommu_device {
>>>> struct device *dev;
>>>> };
>>>>
>>>> +/**
>>>> + * struct iommu_fault_event - Generic per device fault data
>>>> + *
>>>> + * - PCI and non-PCI devices
>>>> + * - Recoverable faults (e.g. page request), information based on
>>>> PCI ATS
>>>> + * and PASID spec.
>>>> + * - Un-recoverable faults of device interest
>>>> + * - DMA remapping and IRQ remapping faults
>>>> + *
>>>> + * @fault: fault descriptor
>>>> + * @device_private: if present, uniquely identify device-specific
>>>> + * private data for an individual page request.
>>>> + * @iommu_private: used by the IOMMU driver for storing
>>>> fault-specific
>>>> + * data. Users should not modify this field before
>>>> + * sending the fault response.
>>>> + */
>>>> +struct iommu_fault_event {
>>>> + struct iommu_fault fault;
>>>> + u64 device_private;
>>> I think we want to move device_private to uapi since it gets injected
>>> into the guest, then returned by guest in case of page response. For
>>> VT-d we also need 128 bits of private data. VT-d spec. 7.7.1
>>
>> Ah, I didn't notice the format changed in VT-d rev3. On that topic, how
>> do we manage future extensions to the iommu_fault struct? Should we add
>> ~48 bytes of padding after device_private, along with some flags telling
>> which field is valid, or deal with it using a structure version like we
>> do for the invalidate and bind structs? In the first case, iommu_fault
>> wouldn't fit in a 64-byte cacheline anymore, but I'm not sure we care.
>>
>>> For exception tracking (e.g. unanswered page request), I can add timer
>>> and list info later when I include PRQ. sounds ok?
>>>> + u64 iommu_private;
>> [...]
>>>> +/**
>>>> + * struct iommu_fault - Generic fault data
>>>> + *
>>>> + * @type contains fault type
>>>> + * @reason fault reasons if relevant outside IOMMU driver.
>>>> + * IOMMU driver internal faults are not reported.
>>>> + * @addr: tells the offending page address
>>>> + * @fetch_addr: tells the address that caused an abort, if any
>>>> + * @pasid: contains process address space ID, used in shared virtual
>>>> memory
>>>> + * @page_req_group_id: page request group index
>>>> + * @last_req: last request in a page request group
>>>> + * @pasid_valid: indicates if the PRQ has a valid PASID
>>>> + * @prot: page access protection flag:
>>>> + * IOMMU_FAULT_READ, IOMMU_FAULT_WRITE
>>>> + */
>>>> +
>>>> +struct iommu_fault {
>>>> + __u32 type; /* enum iommu_fault_type */
>>>> + __u32 reason; /* enum iommu_fault_reason */
>>>> + __u64 addr;
>>>> + __u64 fetch_addr;
>>>> + __u32 pasid;
>>>> + __u32 page_req_group_id;
>>>> + __u32 last_req;
>>>> + __u32 pasid_valid;
>>>> + __u32 prot;
>>>> + __u32 access;
>>
>> What does @access contain? Can it be squashed into @prot?
> it was related to F_ACCESS event record and was a placeholder for
> reporting access attributes of the input transaction (Rnw, InD, PnU
> fields). But I wonder whether this is needed to implement such fine
> level fault reporting. Do we really care?

I think we do, to properly inject PRI/Stall later. But RnW, InD and PnU
can already be described with the IOMMU_FAULT_* flags defined above.
We're missing CLASS and S2, which could also be useful for debugging.
CLASS is specific to SMMUv3 but could probably be represented with
@reason. For S2, we could keep printing stage-2 faults in the driver,
and not report them to userspace.

Thanks,
Jean