Re: [PATCH]: PCI: GART iommu alignment fixes [v2]

From: Prarit Bhargava
Date: Thu Aug 07 2008 - 13:43:26 EST




Jesse Barnes wrote:
On Wednesday, August 6, 2008 7:32 am Prarit Bhargava wrote:
You can't kmalloc pci_dev or setup some trivial values. You need to
use a proper value. The pci code does for us.
Oops -- I meant struct device, not struct pci_dev.

Anwyay, Jesse -- is this true? I can no longer do something like:


static struct device junk_dev = {
.bus_id = "junk device",
.coherent_dma_mask = 0xffffffff,
.dma_mask = &junk_dev.coherent_dma_mask,
};

And then use that as the device target for dma_alloc_coherent? AFAIK,
that has always worked for me.

It gets dangerous since platforms are in control of some pci_dev and dev fields, and if they don't get initialized you can get into trouble.

True, but dma_alloc_coherent also allows for a NULL dev pointer, and uses a dummy struct dev (fallback_device). So it should be callable without any dev struct pointer.

In that case, I hit the BUG() check warning in iommu_is_span_boundary() because boundary_size was calculated as (unsigned long) 0xffffffff + 1 = 0.

That's why we must cast to "unsigned long long".

ie) it is possible to hit this BUG() right now.

Calgary IOMMU has the same code. New AMD IOMMU has the same code too.
Then they don't handle the above problem and are broken when
dma_get_seg_boundary() returns 0xffffffff and will require patches.

/me hasn't tried out Calgary of AMD IOMMU.

Would be good to find someone to do some testing on one of those platforms...

I've pinged someone at AMD to see if I can get my hands on a system (or if to see if there is a system available locally).

As for Calgary, I'm looking into it ATM. I think I can get my hands on one.

If I find the problem on those platforms I'll ping the maintainers and post separate patches. ATM I'm much more concerned about GART.

Maybe I'm missing something -- what implies size has to be a power of
two?
Yes, see iommu_area_alloc().
/me looks and still doesn't see where the size passed into
gart_map_simple() must be a power of two. ... and if that was the case,
shouldn't we be failing all the time? I mean, I've seen values passed
into pci_alloc_consistent like 0x3820 -- clearly not a multiple of 2.

iommu_area_alloc() deals with pages, not actual sizes as
gart_map_simple() does.

Tomonori-san, I think I understand where your confusion may lie. The size argument in the iommu-helper.c code is NOT the same size in dma_alloc_coherent() and gart_map_simple(). In iommu-helper.c the size is the # of pages, and the in the exported function calls it is an actual size. Is that what is confusing you?

If anything, I would make this simple fix:

dma_addr_t map = dma_map_area(dev, paddr, size, dir, size - 1);

should be

dma_addr_t map = dma_map_area(dev, paddr, size, dir, size);

because after my patch we round up the mask argument to get the correct
alignment to # of pages anyway.

Feel like respinning with a full changelog against my for-linus branch? Maybe you can convince Tomonori-san this time. :)


I no longer think the above suggested change is necessary. AFAICT, the code is doing exactly the right thing. "size-1" is correct.

P.

Jesse
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/