Re: [PATCH RT 00/02] SLOB optimizations

From: Matt Mackall
Date: Tue Dec 20 2005 - 13:19:54 EST

Next message: art: "kernel-2.6.15-rt4 compilation error: ?RWSEM_ACTIVE_BIAS? undeclared"
Previous message: Avi Kivity: "Re: [RFC][PATCH] New iovec support & VFS changes"
In reply to: Steven Rostedt: "Re: [PATCH RT 00/02] SLOB optimizations"
Next in thread: Steven Rostedt: "Re: [PATCH RT 00/02] SLOB optimizations"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Dec 20, 2005 at 10:44:20AM -0500, Steven Rostedt wrote:
> (Andrew, I'm CC'ing you and Matt to see if you would like this in -mm)
>
> OK Ingo, here it is.
>
> The old SLOB did the old K&R memory allocations.
>
> It had a global link list "slobfree". When it needed memory it would
> search this list linearly to find the first spot that fit and then
> return it. It was broken up into SLOB_UNITS which was the number of
> bytes to hold slob_t.
>
> Since the sizes of the allocations would greatly fluctuate, the chances
> for fragmentation was very high. This would also cause the looking for
> free locations to increase, since the number of free blocks would also
> increase due to the fragmentation.

On the target systems for the original SLOB design, we have less than
16MB of memory, so the linked list walking is pretty well bounded.

> It also had one global spinlock for ALL allocations. This would
> obviously kill SMP performance.

And again, the locking primarily exists for PREEMPT and small dual-core.
So I'm still a bit amused that you guys are using it for -RT.

> When any block was freed via kfree, it would first search all the big
> blocks to see if it was a large allocation, and if not, then it would
> search the slobfree list to find where it goes. Both taking two global
> spinlocks!

I don't think this is correct, or else indicates a bug. We should only
scan the big block list when the freed block was page-aligned.

> First things first, the first patch was to get rid of the bigblock list.
> I'm simple used the method of SLAB to use the lru list field of the
> corresponding page to store the pointer to the bigblock descriptor which
> has the information to free it. This got rid of the bigblock link list
> and global spinlock.

This I like a lot. I'd like to see a size/performance measurement of
this by itself. I suspect it's an unambiguous win in both categories.

> The next patch was the big improvement, with the largest changes. I
> took advantage of how the kmem_cache usage that SLAB also takes
> advantage of. I created a memory pool like the global one, but for
> every cache with a size less then PAGE_SIZE >> 1.

Hmm. By every size, I assume you mean powers of two. Which negates
some of the fine-grained allocation savings that current SLOB provides.

[...]
> So I have improved the speed of SLOB to almost that of SLAB!

Nice.

For what it's worth, I think we really ought to consider a generalized
allocator approach like Sun's VMEM, with various removable pieces.

Currently we've got something like this:

get_free_pages boot_mem idr resource_* vmalloc ...
|
slab
|
per_cpu/node
|
kmem_cache_alloc
|
kmalloc

We could take it in a direction like this:

generic range allocator (completely agnostic)
|
optional size buckets (reduced fragmentation, O(1))
|
optional slab (cache-friendly, pre-initialized)
|
optional per cpu/node caches (cache-hot and lockless)
|
kmalloc / kmem_cache_alloc / boot_mem / idr / resource_* / vmalloc / ...

(You read that right, the top level allocator can replace all the
different allocators that hand back integers or non-overlapping ranges.)

Each user of, say, kmem_create() could then pass in flags to specify
which caching layers ought to be bypassed. IDR, for example, would
probably disable all the layers and specify a first-fit policy.

And then depending on our global size and performance requirements, we
could globally disable some layers like SLAB, buckets, or per_cpu
caches. With all the optional layers disabled, we'd end up with
something much like SLOB (but underneath get_free_page!).

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: art: "kernel-2.6.15-rt4 compilation error: ?RWSEM_ACTIVE_BIAS? undeclared"
Previous message: Avi Kivity: "Re: [RFC][PATCH] New iovec support & VFS changes"
In reply to: Steven Rostedt: "Re: [PATCH RT 00/02] SLOB optimizations"
Next in thread: Steven Rostedt: "Re: [PATCH RT 00/02] SLOB optimizations"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]