Re: Problems with alpha/pci + radeon/ttm

From: Michael Cree
Date: Thu Jun 24 2010 - 05:51:46 EST

On 22/06/10 20:32, Dave Airlie wrote:
On Tue, Jun 22, 2010 at 3:59 PM, FUJITA Tomonori
<fujita.tomonori@xxxxxxxxxxxxx> wrote:
On Mon, 21 Jun 2010 17:19:43 -0400
Matt Turner<mattst88@xxxxxxxxx> wrote:

Michael Cree and I have been debugging FDO bug 26403 [1]. I tried
booting with `radeon.test=1` and found this, which I think is related:

Note that my radeon card is PCI whereas I think Matt may be using an AGP card.

My logs are very similar to Matt's except I don't see the following line:

pci_map_single failed: could not allocate dma page tables

This happens in the latest git, right?

Indeed, I am testing 2.6.35-rc3 (plus a couple of extra patches to fix unrelated compile errors).

Is this a regression (what kernel version worked)?

Seems that the IOMMU can't find 128 pages. It's likely due to:

- out of the IOMMU space (possibly someone doesn't free the IOMMU


- the mapping parameters (such as align) aren't appropriate so the
IOMMU can't find space.
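
For scale, here is a quick arithmetic sketch of what a 128-page request amounts to. It assumes Alpha's 8 KiB page size (a PAGE_SHIFT of 13), which is not stated anywhere in this thread; the 1048576-byte figure is the object size reported in the radeon "object_init failed" line further down.

```python
# Assumption: Alpha uses 8 KiB pages (PAGE_SHIFT = 13); the thread does not say so.
PAGE_SHIFT = 13
PAGE_SIZE = 1 << PAGE_SHIFT

pages = 128
request_bytes = pages * PAGE_SIZE

# Object size taken from the radeon "object_init failed" log line.
object_size = 1048576

print(f"{pages} pages x {PAGE_SIZE} bytes = {request_bytes} bytes")
print("same size as the radeon test object:", request_bytes == object_size)
```

If the page-size assumption holds, one 128-page mapping is the same size as a single buffer object created by the radeon test.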

I don't think KMS drivers have ever worked on alpha, so it's not a
regression; they are working fine on x86 + powerpc, and sparc has been
run at least once.

KMS on the boot console has worked since about 2.6.32, but starting the X server has always failed and, in my case, the system becomes unstable and eventually oopses.

I suspect we are simply hitting the limits of the IOMMU. How big an
address space does it handle? Graphics drivers generally try to
bind a lot of things to the GART.

No idea on the address space limit. I applied Fujita's patch that logs all IOMMU allocations, and also inserted some extra printks in the ttm kernel code so that I could see which routines failed and the error code returned. Running the radeon test on boot produces the following:

[ 238.712768] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a312000
[ 239.281127] [drm] Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x1a412000
[ 239.281127] ttm_tt_bind belched -12
[ 239.282104] ttm_bo_handle_move_mem belched -12
[ 239.282104] ttm_bo_move_buffer belched -12
[ 239.282104] ttm_bo_validate belched -12
[ 239.282104] radeon 0000:01:00.0: object_init failed for (1048576, 0x00000002) err=-12
[ 239.282104] [drm:radeon_test_moves] *ERROR* Failed to create GTT object 419
[ 239.399291] Error while testing BO move.

Note that no IOMMU allocations are printed while radeon_test_moves is running, so iommu_arena_alloc doesn't appear to be called. Also, the error code returned up to radeon_test_moves is -12, which is ENOMEM. So there does appear to be some memory limit.
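
As a cross-check on the reading of the log above, here is a small sketch that decodes the error codes and computes the step between the two GTT offsets. The log lines are copied from above with timestamps trimmed; the `decode` helper is a hypothetical name, and the errno lookup uses the table of whatever host runs the script, which is assumed to agree with the kernel's for this value.

```python
import errno
import os
import re

# Lines copied from the log above (timestamps trimmed).
LOG = """\
ttm_tt_bind belched -12
ttm_bo_handle_move_mem belched -12
ttm_bo_move_buffer belched -12
ttm_bo_validate belched -12
"""

def decode(code):
    """Return (symbolic name, message) for a negative kernel-style error code."""
    num = -code
    return errno.errorcode.get(num, "?"), os.strerror(num)

# Collect the distinct error codes reported by the extra printks.
codes = {int(m.group(1)) for m in re.finditer(r"belched (-\d+)", LOG)}
for code in sorted(codes):
    name, text = decode(code)
    print(f"{code}: {name} ({text})")

# GTT offsets of the last two successful test copies, from the same log.
step = 0x1a412000 - 0x1a312000
print(f"offset step: {step:#x} = {step // 1024} KiB")
```

Every routine in the chain reports the same code, and the offset step equals the 1048576-byte object size in the failing object_init line.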

It might be worth limiting the PCIGART in radeon to 32MB to see if the
lower limit helps.

So, how does one do that?
