Re: [patch] zoned-2.3.28-J5

Ingo Molnar (mingo@chiara.csoma.elte.hu)
Tue, 16 Nov 1999 11:07:14 +0100 (CET)


On Tue, 16 Nov 1999, Chris Evans wrote:

> > this is zoned-2.3.28-J5, and it should fix all (most) known problems.
>
> So, does this zoned buy us any performance? What other benefits are there?

Zones are separate physical memory (RAM) areas. E.g. the zones in a 6GB box
look like this:

- zone 0: 0-16MB [ZONE_DMA]
- zone 1: 16MB-1GB (roughly) [ZONE_NORMAL]
- zone 2: 1GB-6GB [ZONE_HIGHMEM]

each zone is a 'pool of pages', with separate freelists and separate buddy
bitmaps. The 2.2 allocator had everything in one big zone.
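
A rough sketch of the layout described above, written as plain userspace C
rather than taken from the patch. The names (zone_sketch, zone_of,
zone_pages), the 4KB page size and the exact 1GB boundary are assumptions
made for illustration; the single free_pages counter only stands in for the
per-zone freelists and buddy bitmaps.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12            /* assumed 4KB pages */
#define MB(x) ((unsigned long long)(x) << 20)
#define GB(x) ((unsigned long long)(x) << 30)

enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical per-zone bookkeeping: each zone owns its own physical
 * range and its own free-page accounting, independent of the others. */
struct zone_sketch {
    const char *name;
    unsigned long long start;   /* first physical byte in the zone */
    unsigned long long end;     /* one past the last physical byte */
    unsigned long free_pages;   /* stands in for freelists + buddy bitmaps */
};

/* The 6GB example from the mail; the 1GB boundary is only "roughly" 1GB. */
static struct zone_sketch zones[MAX_NR_ZONES] = {
    { "DMA",     0,      MB(16), 0 },
    { "NORMAL",  MB(16), GB(1),  0 },
    { "HIGHMEM", GB(1),  GB(6),  0 },
};

/* Return the index of the zone containing a physical address, or -1. */
static int zone_of(unsigned long long phys)
{
    for (int i = 0; i < MAX_NR_ZONES; i++)
        if (phys >= zones[i].start && phys < zones[i].end)
            return i;
    return -1;
}

/* Number of pages spanned by a zone. */
static unsigned long zone_pages(const struct zone_sketch *z)
{
    assert(z != NULL && z->end >= z->start);
    return (unsigned long)((z->end - z->start) >> PAGE_SHIFT);
}
```

With these assumed boundaries, zone_of(MB(16)) lands in ZONE_NORMAL and the
DMA zone spans 4096 pages.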

Higher order requests 'steal' DMA pages only as a last resort - previously
GFP_DMA had to search for DMA-able pages by looking through all pages in a
given page-list, and normal GFP_ requests took DMA pages as well. So the
zone allocator alone already gives much better GFP_DMA behavior, even on
smaller boxes. In the future there will be GFP_DMA32 too.

The top-level structure is the 'zonelists' array. Each entry contains a
NULL-delimited list of 'target zones', in priority order. Eg. for
GFP_HIGHMEM (which now covers the majority of allocations done in a Linux
system) the list is { zone2, zone1, zone0, NULL }. For GFP_BUFFER it's
{ zone1, zone0, NULL }. The 'gfp_mask' parameter of the allocation
functions is now an index into this 'zonelists' array; the lookup gets
resolved at compile-time in 99% of the cases.
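
The fallback walk can be sketched roughly as follows. This is not the code
from the patch: the SK_GFP_* indices, the struct names and
alloc_page_sketch() are hypothetical, the { zone0, NULL } list for DMA
requests is an assumption (the mail only spells out the HIGHMEM and BUFFER
lists), and a zone is reduced to a free-page counter.

```c
#include <assert.h>
#include <stddef.h>

enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

/* Hypothetical zone: only a free-page counter, no real freelists. */
struct zone_sketch {
    const char *name;
    unsigned long free_pages;
};

/* Hypothetical request classes, used directly as zonelist indices. */
enum { SK_GFP_DMA, SK_GFP_BUFFER, SK_GFP_HIGHMEM, NR_SK_GFP };

struct zonelist_sketch {
    struct zone_sketch *zones[MAX_NR_ZONES + 1];  /* NULL-delimited */
};

static struct zone_sketch zone[MAX_NR_ZONES] = {
    { "DMA", 4 }, { "NORMAL", 2 }, { "HIGHMEM", 1 },
};

/* Priority order per request class, most preferred zone first. */
static struct zonelist_sketch zonelists[NR_SK_GFP] = {
    [SK_GFP_DMA]     = { { &zone[ZONE_DMA], NULL } },
    [SK_GFP_BUFFER]  = { { &zone[ZONE_NORMAL], &zone[ZONE_DMA], NULL } },
    [SK_GFP_HIGHMEM] = { { &zone[ZONE_HIGHMEM], &zone[ZONE_NORMAL],
                           &zone[ZONE_DMA], NULL } },
};

/* Take one page from the first zone in the list that has a free one.
 * Returns the zone used, or NULL if every candidate zone is empty. */
static struct zone_sketch *alloc_page_sketch(int gfp_index)
{
    assert(gfp_index >= 0 && gfp_index < NR_SK_GFP);
    for (struct zone_sketch **z = zonelists[gfp_index].zones; *z; z++) {
        if ((*z)->free_pages > 0) {
            (*z)->free_pages--;
            return *z;
        }
    }
    return NULL;
}
```

Because the DMA zone sits last in every non-DMA list, it is only touched
once the preferred zones have run dry, and a DMA request never has to look
at non-DMA pages at all.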

Some other checks have been moved into the inlined part as well and get
eliminated at compile-time. The page allocation entry points have been
reduced to the minimum of 2 (formerly we had separate free_pages() and
__free_page(), now both interface into __free_pages_ok()). The result is
a more streamlined and lightweight page allocator (despite the additional
code it has). __free_page() is partly inlined now as well: the
'put_page_testzero()' check is inlined, which is triggered in 60-70% of
the cases.
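
The split between an inlined check and an out-of-line free could look
roughly like this. It is a userspace illustration only: page_sketch, the
*_sketch functions and the slow_path_calls counter are made up for the
example, and the reference count is a plain int rather than the atomic
counter a kernel would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor: just a reference count. */
struct page_sketch {
    int count;
};

static unsigned long slow_path_calls;  /* instrumentation for the sketch */

/* Out-of-line slow path: stands in for __free_pages_ok(), which would
 * hand the block back to its zone's freelists. */
static void free_pages_ok_sketch(struct page_sketch *page, unsigned int order)
{
    (void)page;
    (void)order;
    slow_path_calls++;
}

/* Drop one reference; nonzero if that was the last one. Not atomic here,
 * unlike what a real kernel implementation would require. */
static inline int put_page_testzero_sketch(struct page_sketch *page)
{
    assert(page != NULL && page->count > 0);
    return --page->count == 0;
}

/* Inlined entry point: the refcount test happens at the call site and the
 * function call is only made when the page really has to be freed. */
static inline void free_page_sketch(struct page_sketch *page)
{
    if (put_page_testzero_sketch(page))
        free_pages_ok_sketch(page, 0);
}
```

Dropping a reference on a page that is still in use elsewhere then costs a
decrement and a compare, with no function call.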

The 'zonelists' array is generated at runtime (once at boot), so systems
which do not have highmem do not have to go through the empty zone on
every allocation.
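
One way such a boot-time build step might look, assuming a zone with size 0
counts as absent. build_zonelist_sketch() and the struct layout are
hypothetical, not lifted from the patch.

```c
#include <assert.h>
#include <stddef.h>

enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone_sketch {
    unsigned long size;   /* pages in the zone; 0 means not present */
};

struct zonelist_sketch {
    struct zone_sketch *zones[MAX_NR_ZONES + 1];  /* NULL-delimited */
};

/* Fill `list` with the populated zones from `highest` down to ZONE_DMA,
 * in that priority order, and return how many zones were added. */
static int build_zonelist_sketch(struct zonelist_sketch *list,
                                 struct zone_sketch *zones, int highest)
{
    int n = 0;

    assert(list != NULL && zones != NULL);
    assert(highest >= ZONE_DMA && highest < MAX_NR_ZONES);
    for (int i = highest; i >= ZONE_DMA; i--)
        if (zones[i].size > 0)        /* skip zones with no memory */
            list->zones[n++] = &zones[i];
    list->zones[n] = NULL;
    return n;
}
```

On a 64MB box with no highmem, the list built for the highmem class comes
out as { zone1, zone0, NULL }, so the allocator never visits the empty
zone.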

-- mingo
