Re: Lockup 2.1.6* => kmalloc/slab ???

Mark Hemment (markhe@nextd.demon.co.uk)
Fri, 7 Nov 1997 19:57:49 +0000 (GMT)


Hi Frank/All,

On Thu, 6 Nov 1997, Frank van de Pol wrote:
> The failing kmalloc() apparently goes through the list of memory caches,
> trying to find one that is big enough for the requested size. In my case
> this is a 4096 byte cache.
>
> Then I lost track. In the __kmem_cache_alloc() function it should allocate a
> element, or increase the size of the 4096 byte cache if it is full. I get
> NO message from failures from this routine, as it should when returning a
> NULL...

(Note: There is a pointer to a patch towards the end....).

The SLAB allocator attempts to create caches whose slabs contain several
objects - this helps efficiency - but with large object sizes this can
lead to high-order allocations being requested from the free-page pool.

For example, the "4096" cache uses a slab with 4 objects - that means an
allocation of 4 physically contiguous pages is needed to add one slab to
the cache.
Most slab-caches grow quickly after start-up (when a service is first
used), and do not grow much after that. As these allocations occur when
the free-pool has low fragmentation, they mostly succeed. Until a change
to the frequency (and priority) at which kmem_cache_reap() is called,
these large slabs were rarely released (and hence there were only a few
large allocation requests to the free-pool when fragmentation was high).
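
To make the arithmetic for the "4096" cache concrete, here is a small
user-space sketch (it is not code from mm/slab.c). It assumes a 4096-byte
page, as on Intel, and ignores any per-slab management overhead;
order_for_bytes() is a made-up helper name.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL   /* assumption: i386 page size */

/*
 * Smallest order such that (PAGE_SIZE << order) >= bytes.
 * An order-N request asks the free-pool for 2^N physically
 * contiguous pages.
 */
static unsigned int order_for_bytes(unsigned long bytes)
{
    unsigned int order = 0;

    assert(bytes > 0);
    while ((PAGE_SIZE << order) < bytes)
        order++;
    return order;
}
```

With these assumptions, order_for_bytes(4 * 4096) gives 2, i.e. one slab of
the "4096" cache needs 1 << 2 = 4 contiguous pages.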

The simple solution is to edit mm/slab.c and change the value of
SLAB_BREAK_GFP_ORDER
from 2 to 1. This will make the "4096" cache, and others, throttle back
on the number of pages used for a slab. (In the "4096" cache, the slabs
will have an order of 1 - two physically contiguous pages).
If you are still having problems, then drop this value to 0 (but I
wouldn't really recommend this for performance).
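
A rough model of what the break order does to the "4096" cache, as a
user-space sketch. This is not the logic in mm/slab.c, just an
illustration under assumptions: 4096-byte pages, no per-slab management
overhead, and a preferred minimum of 4 objects per slab (the '4' of the
"slab=4,1" boot option described further down). pick_slab_order() and
MAX_ORDER_CAP are hypothetical names.

```c
#include <assert.h>

#define PAGE_SIZE     4096UL  /* assumption: i386 page size */
#define MAX_ORDER_CAP 5U      /* hypothetical hard limit for this sketch */

/*
 * Hypothetical model: grow the slab order until the slab holds the
 * preferred number of objects, but stop growing at break_order as
 * soon as at least one object fits.
 */
static unsigned int pick_slab_order(unsigned long objsize,
                                    unsigned int min_objs,
                                    unsigned int break_order)
{
    unsigned int order = 0;

    assert(objsize > 0 && min_objs > 0);
    for (;;) {
        unsigned long objs = (PAGE_SIZE << order) / objsize;

        if (objs >= min_objs)
            break;              /* preferred object count reached */
        if (objs >= 1 && order >= break_order)
            break;              /* throttled by the break order */
        if (order >= MAX_ORDER_CAP)
            break;              /* stop growing in this sketch */
        order++;
    }
    return order;
}
```

In this model a 4096-byte object with a break order of 2 ends up with
order-2 slabs (4 pages), a break order of 1 gives order-1 slabs (2 pages),
and a break order of 0 gives single-page slabs holding one object each.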

If you are feeling a little more adventurous, are running 2.1.62 on Intel,
and are _not_ using SMP or VM-cloned tasks (eg. not using a pthreads
library based upon clone()), then you might want to try the patch on my
home-page:
http://www.nextd.demon.co.uk/patch-colour-2.1.62.gz

This gives page-colouring (and a few other performance improvements)
coupled with a weak fragmentation control. I'm calling the control weak,
as I've ripped out all the heavy control stuff I was doing (well, was
doing this morning). It will also speed up most CPU/memory intensive
tasks.
The patch also allows the SLAB_BREAK_GFP_ORDER to be set from the boot
line. To set it to 1, use: "slab=4,1" (don't worry too much about the
4, it is the minimum objects per slab the allocator _tries_ to use for a
cache. Just keep it at '4').
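
For anyone wondering how the two numbers in "slab=4,1" split up, here is a
user-space sketch of parsing the value part of the option. It is not the
parser from the patch; struct slab_opts and parse_slab_opt() are
hypothetical names, and the range checks are my own assumption.

```c
#include <assert.h>
#include <stdio.h>

struct slab_opts {          /* hypothetical container, not from the patch */
    int min_objs;           /* objects per slab the allocator tries for */
    int break_order;        /* value for SLAB_BREAK_GFP_ORDER */
};

/*
 * Parse "<min_objs>,<break_order>", e.g. the "4,1" in "slab=4,1".
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_slab_opt(const char *val, struct slab_opts *out)
{
    int objs, order;
    char trailing;

    assert(val != NULL && out != NULL);
    /* Exactly two integers must match; a third match means trailing junk. */
    if (sscanf(val, "%d,%d%c", &objs, &order, &trailing) != 2)
        return -1;
    if (objs < 1 || order < 0)
        return -1;
    out->min_objs = objs;
    out->break_order = order;
    return 0;
}
```

Passing "4,1" fills in min_objs = 4 and break_order = 1; strings such as
"4" or "4,1x" are rejected.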

If you use the patch, could you let me know the results?

Regards,

markhe