Re: what is the purpose of SLAB and SLUB (was: Re: [PATCH v3] mm/slab: Improve performance of gathering slabinfo) stats

From: Christoph Lameter
Date: Tue Aug 30 2016 - 15:32:07 EST

On Tue, 30 Aug 2016, Mel Gorman wrote:

> > Userspace mapped pages can be hugepages as well as giant pages and that
> > has been there for a long time. Intermediate sizes would be useful too in
> > order to avoid having to keep lists of 4k pages around and continually
> > scan them.
> >
> Userspace pages cannot always be mapped as huge or giant. mprotect on a
> 4K boundary is an obvious example.

Well if the pages are bigger then the boundaries will also be different.
The problem is that we are trying to keep the 4k illustion alive. This
causes churn in various subsystems. Implementation of a file cache
with arbitrary page order is rather straightforward. See

There we run again against the problem of defragmentation. Avoiding decent
garbage collection in the kernel causes no end of additional trouble. I
think we need to face the issue and solve it. Then a lot of other
workaround and complex things are no longer necesary.

> > > Dirty tracking of pages on a 4K boundary will always be required to avoid IO
> > > multiplier effects that cannot be side-stepped by increasing the fundamental
> > > unit of allocation.
> >
> > Huge pages cannot be dirtied?
> I didn't say that, I said they are required to avoid IO multiplier
> effects. If a file is mapped as 2M or 1G then even a 1 byte write requires
> 2M or 1G of IO to writeback.

There are numerous use cases that I know of where this would be
acceptable. Some tuning would be required of course like a mininum period
until writeback occurs.

> > This is an issue of hardware support. On
> > x867 you only have one size. I am pretty such that even intel would
> > support other sizes if needed. The case has been repeatedly made that 64k
> > pages f.e. would be useful to have on x86.
> >
> 64K pages are not a universal win even on the arches that do support them.

There are always corner cases that regress with any kernel "enhancement".
64k page size was a signicant improvement for many of the loads when I
worked at SGI on Altix.