Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
From: Jani Nikula
Date: Mon Dec 23 2019 - 06:41:48 EST
On Mon, 23 Dec 2019, Robin Murphy <robin.murphy@xxxxxxx> wrote:
> On 2019-12-23 10:37 am, Jani Nikula wrote:
>> On Sat, 21 Dec 2019, Tom Murphy <murphyt7@xxxxxx> wrote:
>>> This patchset converts the intel iommu driver to the dma-iommu api.
>>>
>>> While converting the driver I exposed a bug in the intel i915 driver
>>> which causes a huge amount of artifacts on the screen of my
>>> laptop. You can see a picture of it here:
>>> https://github.com/pippy360/kernelPatches/blob/master/IMG_20191219_225922.jpg
>>>
>>> This issue is most likely in the i915 driver and is most likely caused
>>> by the driver not respecting the return value of the
>>> dma_map_ops::map_sg function. You can see the driver ignoring the
>>> return value here:
>>> https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51
>>>
>>> Previously this didnât cause issues because the intel map_sg always
>>> returned the same number of elements as the input scatter gather list
>>> but with the change to this dma-iommu api this is no longer the
>>> case. I wasnât able to track the bug down to a specific line of code
>>> unfortunately.
>>>
>>> Could someone from the intel team look at this?
>>
>> Let me get this straight. There is current API that on success always
>> returns the same number of elements as the input scatter gather
>> list. You propose to change the API so that this is no longer the case?
>
> No, the API for dma_map_sg() has always been that it may return fewer
> DMA segments than nents - see Documentation/DMA-API.txt (and otherwise,
> the return value would surely be a simple success/fail condition).
> Relying on a particular implementation behaviour has never been strictly
> correct, even if it does happen to be a very common behaviour.
>
>> A quick check of various dma_map_sg() calls in the kernel seems to
>> indicate checking for 0 for errors and then ignoring the non-zero return
>> is a common pattern. Are you sure it's okay to make the change you're
>> proposing?
>
> Various code uses tricks like just iterating the mapped list until the
> first segment with zero sg_dma_len(). Others may well simply have bugs.
Thanks for the clarification.
BR,
Jani.
>
> Robin.
>
>> Anyway, due to the time of year and all, I'd like to ask you to file a
>> bug against i915 at [1] so this is not forgotten, and please let's not
>> merge the changes before this is resolved.
>>
>>
>> Thanks,
>> Jani.
>>
>>
>> [1] https://gitlab.freedesktop.org/drm/intel/issues/new
>>
>>
--
Jani Nikula, Intel Open Source Graphics Center