Re: [PATCH] TTM DMA pool v1.8

From: Thomas Hellstrom
Date: Mon Oct 03 2011 - 12:37:48 EST

On 09/30/2011 04:09 PM, Konrad Rzeszutek Wilk wrote:
On Fri, Sep 30, 2011 at 08:59:52AM +0200, Thomas Hellstrom wrote:

I'm really sorry for taking so long to review this.
That is OK. We all are busy - and it gave me some time to pretty
up the code even more.

I'd like to go through a couple of high-level things first before
reviewing the coding itself.

The page_alloc_func structure looks nice, but I'd like to have it
per ttm backend,
Meaning the 'struct ttm_backend_func' right?

Yes, that's right.

we would just need to make sure that the backend is alive when we
alloc / free pages.
The reason for this is that there may be backends that want to
allocate dma memory running simultaneously with those who don't.
As in for some TTM manager use the non-DMA, and for other DMA
(is_dma32 is set?) Or say two cards - one that has the concept
of MMU and one that does not and both of them are readeon?

For example, or let's say you have a low-end system that in the future needs to
allocate DMA memory, and want to plug in a high-end graphics card, like Radeon.

When the backend fires up, it can determine whether to use DMA
memory or not.
Or more of backends (and drivers) that do not have any concept
of MMU and just use framebuffers and such?

I think we would just have to stick in a pointer to the
appropiate 'struct ttm_page_alloc_func' (and the 'struct device')
in the 'struct ttm_backend_func'. And this would be done by
backend that would decided which one to use.

Yes, either that, or merge the two structs.

And the TTM could find out which page_alloc_func to use straight
from the ttm_backend_func and call that (along with the 'struct device'
also gathered from that structure). Rough idea (on top of the patches):

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index dd05db3..e7a0c3c 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -902,12 +902,12 @@ struct ttm_page_alloc_func ttm_page_alloc_default = {

int ttm_get_pages(struct list_head *pages, int flags,
enum ttm_caching_state cstate, unsigned count,
- dma_addr_t *dma_address, struct device *dev)
+ dma_addr_t *dma_address, struct ttm_backend *be)

I'd like to see it even more simple. If the ttm_backend_func is responsible for allocating pages,
ttm_get_pages would be called by the backend code, and the dma_addr_t pointer can be kept
in the backend object. No need to expose neither device nor dma address to core ttm that
really doesn't want to care. The dma_address is then available in the backend only
for binding / unbinding, and ttm_get_pages will be called exclusively by the backend, and its
interface can pass struct device.

And "ttm/tt: Move ttm_tt_set_page_caching implementation in TTM page pool code."
would still be there, except it would be done per ttm-backend. Well
by choosing which TTM page pool the TTM backend would use.


2) Memory accounting: If the number DMA pages are limited in a way
that the ttm memory global routines are not aware of. How do we
handle memory accounting? (How do we avoid exhausting IOMMU space)?
Good question. Not entirely sure about that. I know that on
SWIOTLB there is no limit (as you do not use the bounce buffer)
but not sure about VT-D, AMD-VI, GART, or when there is no IOMMU.

Let me get back to you on that.

Granted, the code _only_ gets activated when we use SWIOTLB right
now so the answer is "no exhausting" currently. Either way let me
dig a bit deeper.

I'm fine with it working OK with SWIOTLB now.
When we need to handle other situations, let's find out how to do it then.

3) Page swapping. Currently we just copy pages to shmem pages and
then free device pages. In the future we'd probably like to insert
non-dma pages directly into the swap cache. Is it possible to
differentiate dma pages from pages that are directly insertable?
Yes. The TTM DMA pool keeps track of which pages belong to which
pool and will reject non-dma pages (or pages which belong to
a different pool). It is fairly easy to expose that check
so that the generic TTM code can make the proper distinction.

But also - once you free a device page ('cached') it gets freed.
The other pages (writecombine, writeback, uncached) end up sitting
in a recycle pool to be used. This is believe how the current
TTM page code works right now (and this TTM DMA pool follows).

Yes, that's OK, as long as the system shrinker can free those pages.

The swapping code back (so from swap to pool) does not seem to
distinguish it that much - it just allocates a new page - and
then copies from whatever was in the swap cache?

This is something you were thinking to do in the future I presume?

Yes. If / when I do that, I might be adding a new backend function to put a ttm in an
"anonymous state", that is using only pages that can be inserted in the swap cache or passed
around to other devices, and to put a ttm in a "device" state, that copies it to device mappable pages.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at