Re: TTM page pool allocator

From: Michel Dänzer
Date: Thu Jul 09 2009 - 04:48:33 EST


On Thu, 2009-07-09 at 16:06 +1000, Dave Airlie wrote:
> 2009/6/30 Thomas Hellström <thomas@xxxxxxxxxxxx>:
> > Jerome Glisse wrote:
> >>
> >> On Fri, 2009-06-26 at 10:00 +1000, Dave Airlie wrote:
> >>
> >>>
> >>> On Thu, Jun 25, 2009 at 10:01 PM, Jerome Glisse<glisse@xxxxxxxxxxxxxxx>
> >>> wrote:
> >>>
> >>>>
> >>>> Hi,
> >>>>
> >>>> Thomas, I attach a reworked page pool allocator based on Dave's work;
> >>>> this one should be OK with TTM cache status tracking. It definitely
> >>>> helps on AGP systems; now the bottleneck is in Mesa's vertex DMA
> >>>> allocation.
> >>>>
> >>>>
> >>>
> >>> My original version kept a list of WB pages as well. This proved to be
> >>> quite a useful optimisation on my test systems when I implemented it:
> >>> without it I was spending ~20% of my CPU time getting free pages.
> >>> Granted, I always used WB pages on PCIE/IGP systems.
> >>>
> >>> Another optimisation I made at the time was around the populate call
> >>> (not sure if this is what still happens):
> >>>
> >>> Allocate a 64K local BO for DMA object.
> >>> Write into the first 5 pages from userspace - get WB pages.
> >>> Bind to GART, swap those 5 pages to WC + flush.
> >>> Then populate the rest with WC pages from the list.
> >>>
> >>> Granted I think allocating WC in the first place from the pool might
> >>> work just as well since most of the DMA buffers are write only.
> >>>
> >>> Dave.
> >>> --
> >>>
> >>
> >> Attached is a new version of the patch, which integrates the changes discussed.
> >>
> >> Cheers,
> >> Jerome
> >>
> >
> > Hi, Jerome!
> > Still some outstanding things:
> >
> > 1) The AGP protection fixes compilation errors when AGP is not enabled, but
> > what about architectures that need the map_page_into_agp() semantics for TTM
> > even when AGP is not enabled? At the very least TTM should be disabled on
> > those architectures. The best option would be to make those calls non-agp
> > specific.
> >
> > 2) Why is the page refcount upped with get_page() after an alloc_page()?
> >
> > 3) It seems like pages are cache-transitioned one-by-one when freed. Again,
> > this is a global TLB flush per page. Can't we free a large chunk of pages at
> > once?
> >
>
> Jerome,
>
> have we addressed these?
>
> I'd really like to push this soon, as I'd like to fix up the 32 vs 36
> bit DMA masks if possible, which relies on us being able to tell the
> allocator to use GFP_DMA32 on some hw (32-bit PAE mainly with a PCI card).

FWIW, I tried this patch on my PowerBook, and it didn't go too well:

With AGP enabled, the kernel panics before there's any KMS display.

With AGP disabled, I get a KMS display, but it shows a failure to
allocate the ring buffer, and then it stops updating.


--
Earthling Michel Dänzer | http://www.vmware.com
Libre software enthusiast | Debian, X and DRI developer