Re: PCI-DMA: Out of IOMMU space on x86-64 (Athlon64x2), with solution

From: Jens Axboe
Date: Thu Mar 02 2006 - 04:57:58 EST


On Thu, Mar 02 2006, Andi Kleen wrote:
> On Thursday 02 March 2006 00:23, Michael Monnerie wrote:
> > Hello, I use SUSE 10.0 with all updates and actual kernel 2.6.13-15.8 as
> > provided from SUSE (just self compiled to optimize for Athlon64, SMP,
> > and HZ=100), with an Asus A8N-E motherboard, and an Athlon64x2 CPU.
> > This host is used with VMware GSX server running 6 Linux client and one
> > Windows client host. There's a SW-RAID1 using 2 SATA HDs.
>
> Nvidia hardware SATA cannot directly DMA to > 4GB, so it
> has to go through the IOMMU. And in that kernel the Nforce
> ethernet driver also didn't do >4GB access, although the ethernet HW
> is theoretically capable.
>
> Maybe VMware pins unusually much IO memory in flight (e.g. by using
> a lot of O_DIRECT). That could potentially cause the IOMMU to fill up.
> The RAID-1 probably also makes it worse because it will double the IO
> mapping requirements.
>
> Or you have a leak in some driver, but if the problem goes away
> after enlarging the IOMMU that's unlikely.
>
> What would probably help is to get a new SATA controller that can
> access >4GB natively and at some point update to a newer kernel
> with newer forcedeth driver. Or just run with the enlarged IOMMU.

libata should also handle this case better. Usually we just need to
defer command handling if the dma_map_sg() fails. Changing
ata_qc_issue() to return nsegments for success, 0 for defer failure, and
-1 for permanent failure should be enough. The SCSI path is easy at
least, as we can just ask for a defer there. The internal qc_issue is a
little more tricky.

The NCQ patches have logic to handle this, although for other reasons
(to avoid overlap between NCQ and non-NCQ commands). It could easily be
reused for this as well.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/