Re: [RFC 1/7] cpuset write dirty map

From: Christoph Lameter
Date: Wed Jun 27 2007 - 08:45:14 EST


On Wed, 27 Jun 2007, Andrew Morton wrote:

> I'm more concerned about all of Mel's code in -mm actually. I don't recall
> anyone doing a full review recently and I'm still not sure that this is the
> overall direction in which we wish to go. Last time I asked this everyone
> seemed a bit waffly and non-committal.

I have looked over this several times lately and it all seems quite okay.
Still cannot find a justification for the movable zone (never had to
use it in testing) but it seems that it makes memory unplug easier. The
antifrag patchset together with a page migration patch simplifies the
unplug patchset significantly.

I think the antifrag code is a significant step forward and will enable
lots of other features (memory unplug, larger page use in SLUB, huge page
allocs after boot). It may be useful to put memory compaction and memory
unplug in at the same time (I think we can get there even for .23) so that
we have a full package. With compaction we can finally recover from loads
that typically cause memory to be split in a lot of disjoint pieces and
get to a sitaution were we can dynamically reconfigure the number of huge
pages at run time (Our customers currently reboot to do this which is a
pain). Compaction increases the chance of I/O controllers being able to
merge I/O requests since contiguous pages can be served by the page
allocator again. Antifrag almost gets there but I can still construct
allocation scenarios that fragment memory significantly.

Also compaction is a requirement if we ever want to support larger
blocksizes. That would allow the removal of various layers that are now
needed to compensate for not supporting larger pages.

The whole approach is useful to increase performance. We have seen
several percentage points of performance wins with SLUB when allowing
larger pages sizes. The use of huge pages is also mainly done for
performance reasons. The large blocksize patch has shown a 50% performance
increase even in its prototype form where we certainly have not solved
server performance issues.

Even without large blocksize: The ability to restore the capability of the
page allocator to serve pages that are in sequence can be used to shorten
the scatter gather lists in the I/O layer speeding up I/O.

I think this is an important contribution that will move a lot of other
issues forward.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/