Re: Cache incoherencies

Russell King (rmk@milldev.demon.co.uk)
Fri, 27 Aug 1999 17:06:02 +0100 (BST)


Rogier Wolff said:
> gfp returns a page from a pool of pages.

To repeat, which may or may not be mapped in using page table
entries, depending on the architecture.

> If we would start out with an uncacheable region flagged as such, gfp
> can return uncacheable pages just like it can return DMA-able pages.

The way get_free_pages works is that if you ask for a DMA page, it
guarantees that the page it returns is one for which CAN_DMA is true.
However, if you don't ask for a DMA page, it can still return that
same page.

This is handled in the horrendous RMQUEUE macro, which is highly
optimised for fast allocation (I believe that fast memory allocation
is critical to Linux).
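
For example (a minimal sketch of the current __get_free_pages
interface; the exact flag spellings depend on the kernel version):

	unsigned long page;

	/* Ask for a page which CAN_DMA is true for. */
	page = __get_free_pages(GFP_KERNEL | GFP_DMA, 0);

	/* Without GFP_DMA, the allocator may still hand back a
	 * DMA-capable page; the flag only constrains the result,
	 * it does not reserve DMA memory for DMA callers. */
	page = __get_free_pages(GFP_KERNEL, 0);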

> The number of different "classes" of memory is growing rapidly. I am
> suggesting to take a look at GFP, to see if it can be generalized to
> the point where it can be useful for multiple DMA-capable ranged. Alan
> says that there is a 27-bit limit for a certain sound-card. Sure.
> There is a 32bit limit for many PCI cards. etc.

This is a different issue from the cached/uncached memory region
altogether.

> Alan suggests that we wait until we really hit the wall. I'm saying
> I'd prefer to start looking for (general) solutions now (*).

Alan suggested that we don't touch GFP if we don't have to. I'm
saying that we don't have to. There is a much nicer solution.

> Once gfp handles multiple, overlapping types of memory, it can easily
> handle the pre-allocated uncacheable pages.

Alan said that the gfp flags are inadequate, but we'll tackle that when
we have to. We have to tackle making non-cacheable pages first, which
can be done in the way that Benjamin and I are suggesting, with the
minimum number of changes. I also believe it to be the best solution
all round.

> Now that last case is a bit special: You can create more of them on
> demand. In fact, for lots of cases you can create more of them on
> demand. For example, you can force a user-page into non-dma-able
> memory if you're looking for a dma-able page. So calling GFP you could
> pass a flag saying "If not immediately available, try and get some
> more". For DMA-able pages, that may mean swapping userpages to disk or
> to "high" memory. For uncacheable pages, that may mean giving an empty
> page a new, uncached mapping, and flushing the cache while we're at
> it. Costly, but "you asked for it".

So what you're advocating for uncached pages is that GFP maintains a
pool of pages it can return, kept in addition to the existing kernel
space mapping of all pages, and grown as required by calling GFP.

A region of virtual address space will have to be found on every
architecture that needs to support uncacheable areas, in which to place
this memory. In addition, extra pmds and ptes will have to be allocated
for it, which will require code similar to vmalloc to handle the pgds,
pmds and ptes. Well, why don't we just use that code to do it? Aren't
we then back to the vmalloc_unaligned (or alloc_unaligned)
implementation?
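
In outline, that code would have to do something like this (a rough
sketch only; the uncacheable-pte step is architecture specific and the
function name here is illustrative):

	void *alloc_uncached_area(unsigned long size)
	{
		struct vm_struct *area;

		/* 1. find a free range of kernel virtual addresses,
		 *    exactly as vmalloc does */
		area = get_vm_area(size);
		if (!area)
			return NULL;

		/* 2. allocate pgd/pmd/pte entries covering the range */
		/* 3. allocate the backing pages and install ptes with
		 *    the architecture's uncacheable protection bits */

		return area->addr;
	}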

The GFP idea has another flaw: virt_to_phys and virt_to_bus are
defined such that they return the physical address or bus address of a
page returned by GFP or kmalloc. Allowing GFP to return pages from a
pool, which may contain randomly allocated pages, breaks this
definition. It makes the use of virt_to_phys conditional on how you
called GFP, which makes the code messier and harder to understand.
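
On most architectures virt_to_phys is nothing more than linear
arithmetic over the kernel's direct mapping, along these lines
(x86-flavoured sketch):

	static inline unsigned long virt_to_phys(volatile void *address)
	{
		/* direct-mapped kernel addresses only */
		return (unsigned long) address - PAGE_OFFSET;
	}

A page handed out from a separately-mapped pool does not live in that
direct mapping, so the arithmetic silently gives the wrong answer for
it.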

Linus' philosophy has always been to 'do something, and do it well'.
get_free_pages returns pages, and it does it well. I suggest that our
general requirement is to get an uncacheable page. As I see it thus
far, the best way we can do that is to use get_free_pages (which does
its job well) and have another level on top of it that creates an
uncached page, and does that well.
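
Roughly, something along these lines (a sketch only;
remap_page_uncached is a made-up name for that other level, whatever
form it finally takes):

	void *get_uncached_page(int gfp_mask)
	{
		/* get_free_pages does the allocation... */
		unsigned long page = __get_free_page(gfp_mask);

		if (!page)
			return NULL;

		/* ...flush any dirty cache lines for the page
		 * (architecture specific)... */

		/* ...and the layer above gives it a second,
		 * uncacheable mapping, vmalloc-style. */
		return remap_page_uncached(page);
	}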

> Currently (IIRC) there is a big macro instantiated twice to handle DMA
> or non-DMA memory. If we generalize this to handle many lists of free
> pages, we'll move to just one instantiation of "grab a block from a
> list". This is likely to have a smaller cache-footprint leading to a
> performance improvement.

Unless my sight is failing, grep and less indicate that the macro is
instantiated only once to handle the memory allocation.

--
Russell King (rmk@milldev.demon.co.uk)
