Re: [PATCH v4 03/22] iommu: introduce device fault report API
From: Jean-Philippe Brucker
Date: Thu Mar 07 2019 - 06:43:26 EST
On 06/03/2019 23:46, Jacob Pan wrote:
> On Tue, 5 Mar 2019 15:03:41 +0000
> Jean-Philippe Brucker <jean-philippe.brucker@xxxxxxx> wrote:
>
>> On 18/02/2019 13:54, Eric Auger wrote:
>> [...]
>> > +/**
>> > + * iommu_register_device_fault_handler() - Register a device fault
>> > handler
>> > + * @dev: the device
>> > + * @handler: the fault handler
>> > + * @data: private data passed as argument to the handler
>> > + *
>> > + * When an IOMMU fault event is received, call this handler with
>> > the fault event
>> > + * and data as argument. The handler should return 0 on success.
>> > If the fault is
>> > + * recoverable (IOMMU_FAULT_PAGE_REQ), the handler can also
>> > complete
>> > + * the fault by calling iommu_page_response() with one of the
>> > following
>> > + * response code:
>> > + * - IOMMU_PAGE_RESP_SUCCESS: retry the translation
>> > + * - IOMMU_PAGE_RESP_INVALID: terminate the fault
>> > + * - IOMMU_PAGE_RESP_FAILURE: terminate the fault and stop
>> > reporting
>> > + * page faults if possible.
>>
>> The comment refers to function and values that haven't been defined
>> yet. Either the page_response() patch should come before, or we need
>> to split this patch.
>>
>> Something I missed before: if the handler fails (returns != 0) it
>> should complete the fault by calling iommu_page_response(), if we're
>> not doing it in iommu_report_device_fault(). It should be indicated
>> in this comment. It's safe for the handler to call page_response()
>> since we're not holding fault_param->lock when calling the handler.
>>
> If the page request fault is to be reported to a guest, the report
> function cannot wait for the completion status. As long as the fault is
> injected into the guest, the handler should complete with success. If
> the PRQ report fails, IMHO, the caller of iommu_report_device_fault()
> should send page_response, perhaps after cleaning up all partial responses
> of the group too.
OK, having the caller (the IOMMU driver) send the page_response when
iommu_report_device_fault() fails does make more sense. Agreed on the
partial cleanup as well. We don't keep track of partial responses here,
but I need to add that to the io-pgfault layer. However, some cleanup
should probably happen in here...
>> > + /* we only report device fault if there is a handler
>> > registered */
>> > + mutex_lock(&dev->iommu_param->lock);
>> > + if (!dev->iommu_param->fault_param ||
>> > + !dev->iommu_param->fault_param->handler) {
>> > + ret = -EINVAL;
>> > + goto done_unlock;
>> > + }
>> > + fparam = dev->iommu_param->fault_param;
>> > + if (evt->fault.type == IOMMU_FAULT_PAGE_REQ &&
>> > + evt->fault.prm.flags &
>> > IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE) {
>> > + evt_pending = kmemdup(evt, sizeof(struct
>> > iommu_fault_event),
>> > + GFP_KERNEL);
>> > + if (!evt_pending) {
>> > + ret = -ENOMEM;
>> > + goto done_unlock;
>> > + }
>> > + mutex_lock(&fparam->lock);
>> > + list_add_tail(&evt_pending->list, &fparam->faults);
>> > + mutex_unlock(&fparam->lock);
>> > + }
>> > + ret = fparam->handler(evt, fparam->data);
... if ret != 0, removing and freeing the pending event here seems more
appropriate than asking our caller to do it.
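
As a rough illustration of that split, here is a userspace model, not
kernel code. Every name in it (mock_report_fault, mock_driver_handle_prq,
struct mock_fault_param, the MOCK_RESP_* codes) is made up for the sketch.
It leaves out both locks, queues every fault rather than only page
requests with the LAST_PAGE flag, and uses a bare singly linked list in
place of list_head. Answering with an "invalid" code on failure is an
assumption; the thread does not settle which code the driver should send.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical response codes, loosely modelled on the patch comment. */
enum mock_resp { MOCK_RESP_NONE, MOCK_RESP_INVALID };

/* Simplified pending-fault record; the real code uses a list_head. */
struct mock_pending {
	int grpid;
	struct mock_pending *next;
};

struct mock_fault_param {
	struct mock_pending *faults;            /* faults awaiting a response */
	int (*handler)(int grpid, void *data);
	void *data;
	enum mock_resp last_resp;               /* last response sent to the device */
};

/* Unlink one entry from the pending list and free it. */
static void mock_drop_pending(struct mock_fault_param *fp, struct mock_pending *p)
{
	struct mock_pending **link = &fp->faults;

	while (*link && *link != p)
		link = &(*link)->next;
	if (*link) {
		*link = p->next;
		free(p);
	}
}

/* Report side: queue the fault, call the handler, undo the queueing on failure. */
int mock_report_fault(struct mock_fault_param *fp, int grpid)
{
	struct mock_pending *p;
	int ret;

	assert(fp);
	if (!fp->handler)
		return -EINVAL;

	p = malloc(sizeof(*p));
	if (!p)
		return -ENOMEM;
	p->grpid = grpid;
	p->next = fp->faults;   /* head insertion, for brevity */
	fp->faults = p;

	ret = fp->handler(grpid, fp->data);
	if (ret)
		mock_drop_pending(fp, p);   /* caller never sees the stale entry */
	return ret;
}

/* Caller side (the driver): respond on its own if the report failed. */
int mock_driver_handle_prq(struct mock_fault_param *fp, int grpid)
{
	int ret = mock_report_fault(fp, grpid);

	if (ret)
		fp->last_resp = MOCK_RESP_INVALID;  /* choice of code is an assumption */
	return ret;
}

/* Trivial handlers to exercise both paths. */
int mock_handler_ok(int grpid, void *data)   { (void)grpid; (void)data; return 0; }
int mock_handler_fail(int grpid, void *data) { (void)grpid; (void)data; return -EIO; }
```

With mock_handler_fail installed, mock_driver_handle_prq() returns the
handler's error, the pending list is empty again, and the driver has sent
its own response. With mock_handler_ok the entry stays queued until a
later response removes it.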
Thanks,
Jean
>> > +done_unlock:
>> > + mutex_unlock(&dev->iommu_param->lock);
>> > + return ret;
>> > +}
>> > +EXPORT_SYMBOL_GPL(iommu_report_device_fault);
>> [...]
>
> [Jacob Pan]