RE: intel-iommu.c bug

From: David Woodhouse
Date: Fri Dec 17 2010 - 13:45:50 EST


On Fri, 2010-12-17 at 10:28 -0800, Suresh Siddha wrote:
> > >-----Original Message-----
> > >From: Cliff Wickman [mailto:cpw@xxxxxxx]
> > >Sent: Friday, December 17, 2010 7:03 AM
> > >To: linux-kernel@xxxxxxxxxxxxxxx
> > >Cc: Raj, Ashok; Li, Shaohua; Keshavamurthy, Anil S; Yu, Fenghua
> > >Subject: intel-iommu.c bug
> > >
> > >
> > >
> > >This bug was reported by Mike Habeck <habeck@xxxxxxx>.
> > >The test system was an SGI Altix UV. These are Mike's words:
> > >
> > > It appears there is a bug in the iommu code where, when 'forcedac' isn't used,
> > > the nvidia driver is handed back a 44-bit dma address even though its
> > > dma_mask is set to 40 bits.
> > >
> > > I added some debug to the intel_iommu code and I see:
> > > intel_map_sg(): dma_addr_t=0xf81fffff000, pdev->dma_mask=0xffffffffff
> > >
> > > Note the dma_addr_t being handed back is 44 bits even though the mask is 40 bits.
> > > This results in the nvidia card generating a bad dma (i.e. the nvidia hw is
> > > only capable of generating a 40-bit dma address, so the upper 4 bits are lost
> > > and that results in the iommu hw detecting a bad dma access):
> > >
> > > DRHD: handling fault status reg 2
> > > DMAR:[DMA Read] Request device [36:00.0] fault addr 81fffff000
> > > DMAR:[fault reason 06] PTE Read access is not set
> > >
> > > If I boot with 'forcedac' then the dma mask is honored and the dma_addr_t
> > > handed back is 40 bits:
> > >
> > > intel_map_sg(): dma_addr_t=0xfffffff000, pdev->dma_mask=0xffffffffff
> > >
> > > Without forcedac you'd expect these early mappings being handed back to be 32-bit.
> > > This is the first debug printf (so the first mapping the nvidia device has
> > > requested), so I'd expect it to be 0xfffff000... interestingly, that is what the
> > > lower 32 bits of the address being handed back are... those upper 0xf81 bits
> > > appear to be garbage. This might be a hint to help find the bug...

Hm, you're right; it should be returning addresses under 4GiB until that
space is exhausted and it has to use higher addresses, unless you pass
'forcedac' on the command line.
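
That fault address is consistent with the device simply dropping the top bits
of the IOVA it was given. A trivial userspace check of the arithmetic, using
the numbers from your debug output:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* address handed back by intel_map_sg() in your trace */
        uint64_t iova = 0xf81fffff000ULL;
        /* the device's 40-bit dma_mask */
        uint64_t mask = 0xffffffffffULL;

        /* what a 40-bit-only device actually puts on the bus */
        printf("%#llx\n", (unsigned long long)(iova & mask));
        return 0;
    }

which prints 0x81fffff000 -- exactly the fault address in your DMAR log. So
the device isn't doing anything wrong; it's just being handed an address it
can't express.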

Please could you instrument alloc_iova() (in drivers/pci/iova.c) to print
its arguments, and also the pfn_hi and pfn_lo of the iova it returns?
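
Something like this should be enough (untested sketch; I think the iova being
returned in that function is called new_iova, but adjust to whatever the local
names actually are):

    /* near the top of alloc_iova() */
    printk(KERN_DEBUG "alloc_iova: size %lu limit_pfn 0x%lx aligned %d\n",
           size, limit_pfn, size_aligned);

    /* just before the successful return */
    printk(KERN_DEBUG "alloc_iova: pfn_lo 0x%lx pfn_hi 0x%lx\n",
           new_iova->pfn_lo, new_iova->pfn_hi);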

If those are sane, can you print start_vpfn at about line 2909 of
intel-iommu.c? And if *that* looks sane, print iov_pfn and the new value
of sg->dma_address each time that's set, at around line 1674.
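
i.e. something along these lines (again untested; adjust to the actual
variable names at those spots if they differ):

    /* around line 2909, once start_vpfn has been computed */
    printk(KERN_DEBUG "intel_map_sg: start_vpfn 0x%lx\n", start_vpfn);

    /* around line 1674, each time sg->dma_address is set from iov_pfn */
    printk(KERN_DEBUG "mapping: iov_pfn 0x%lx dma_address 0x%llx\n",
           iov_pfn, (unsigned long long)sg->dma_address);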

Is this a 32-bit or a 64-bit kernel?

--
dwmw2
