Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api

From: Robin Murphy
Date: Mon Dec 23 2019 - 06:29:38 EST


On 2019-12-23 10:37 am, Jani Nikula wrote:
On Sat, 21 Dec 2019, Tom Murphy <murphyt7@xxxxxx> wrote:
This patchset converts the intel iommu driver to the dma-iommu api.

While converting the driver I exposed a bug in the intel i915 driver
which causes a huge amount of artifacts on the screen of my
laptop. You can see a picture of it here:
https://github.com/pippy360/kernelPatches/blob/master/IMG_20191219_225922.jpg

This issue is most likely in the i915 driver and is most likely caused
by the driver not respecting the return value of the
dma_map_ops::map_sg function. You can see the driver ignoring the
return value here:
https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51

Previously this didnât cause issues because the intel map_sg always
returned the same number of elements as the input scatter gather list
but with the change to this dma-iommu api this is no longer the
case. I wasnât able to track the bug down to a specific line of code
unfortunately.

Could someone from the intel team look at this?

Let me get this straight. There is current API that on success always
returns the same number of elements as the input scatter gather
list. You propose to change the API so that this is no longer the case?

No, the API for dma_map_sg() has always been that it may return fewer DMA segments than nents - see Documentation/DMA-API.txt (and otherwise, the return value would surely be a simple success/fail condition). Relying on a particular implementation behaviour has never been strictly correct, even if it does happen to be a very common behaviour.

A quick check of various dma_map_sg() calls in the kernel seems to
indicate checking for 0 for errors and then ignoring the non-zero return
is a common pattern. Are you sure it's okay to make the change you're
proposing?

Various code uses tricks like just iterating the mapped list until the first segment with zero sg_dma_len(). Others may well simply have bugs.

Robin.

Anyway, due to the time of year and all, I'd like to ask you to file a
bug against i915 at [1] so this is not forgotten, and please let's not
merge the changes before this is resolved.


Thanks,
Jani.


[1] https://gitlab.freedesktop.org/drm/intel/issues/new