Re: 2.6.25-rc7-git2: Reported regressions from 2.6.24

From: Christoph Lameter
Date: Mon Mar 31 2008 - 14:47:54 EST


On Sat, 29 Mar 2008, Linus Torvalds wrote:

>
> You don't have a f*cking clue about this code that you're supposed to be
> maintaining, do you?
>
> See "slab_alloc()". See the code:
>
>     if (unlikely((gfpflags & __GFP_ZERO) && object))
>             memset(object, 0, c->objsize);
>
> and see how it does it regardless of anything else.

Yes I am very aware of that.

> In short, if *any* code-path calls down to any allocator from that routine
> with GFP_ZERO set, it's a bug. No ifs, buts or maybes about it. It
> shouldn't do that, because the actual memset() is done by slab_alloc(),
> and should not be done ANYWHERE ELSE.
>
> It has *nothing* to do with "object is too big" or anything else.

It has to do with how large objects are allocated through kmalloc_large().
Elsewhere, kmalloc_large() is called with unfiltered gfpflags and relies
on the page allocator to handle the zeroing, so it can take unfiltered gfp
flags.
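
To illustrate the point about unfiltered flags, here is a minimal
user-space sketch. It is not the kernel code: model_page_alloc(),
model_kmalloc_large(), MODEL_GFP_ZERO and the counter are hypothetical
stand-ins, and malloc() plays the role of the page allocator. It only shows
a large-allocation helper that forwards the flags untouched and leaves the
zeroing to the layer below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag bit, for illustration only (not the kernel value). */
#define MODEL_GFP_ZERO 0x1u

/* Counts how many zeroing passes were performed. */
static unsigned int model_memset_calls;

/* Stand-in for the page allocator: it honours the zero flag itself. */
static void *model_page_alloc(unsigned int flags, size_t size)
{
	void *p = malloc(size);

	if (p && (flags & MODEL_GFP_ZERO)) {
		memset(p, 0, size);
		model_memset_calls++;
	}
	return p;
}

/* Stand-in for kmalloc_large(): the flags are passed through unfiltered. */
static void *model_kmalloc_large(size_t size, unsigned int flags)
{
	return model_page_alloc(flags, size);
}
```

In this model a caller that passes MODEL_GFP_ZERO gets zeroed memory after
exactly one zeroing pass, done by the lower layer.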

The filtering of __GFP_ZERO that you added avoids the double zeroing for
the fallback path (which is only called if all the partial lists are empty
and after the page allocator went through reclaim and did not get the
large-sized memory we wanted). So okay, the patch could be a performance
enhancement. But then it adds the filtering to the hot path instead of the
code path that contains the kmalloc_large() that is executed once in a blue
moon. The hot path should only filter when we actually decide that we need
to allocate a new slab from the page allocator.
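
A rough user-space sketch of that placement follows. It is a simplified
model under stated assumptions, not the SLUB implementation: every sketch_*
name, SKETCH_GFP_ZERO and SKETCH_OBJSIZE are hypothetical, the per-cpu
freelist is reduced to a single cached pointer, and malloc() stands in for
the page allocator. The flag is masked only in the slow path, so the fast
path does no flag manipulation, and the one memset() stays in the common
allocation routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_GFP_ZERO 0x1u	/* hypothetical flag bit */
#define SKETCH_OBJSIZE  64	/* hypothetical object size */

static unsigned int sketch_zero_passes;	/* number of zeroing passes */
static void *sketch_freelist;		/* one-entry model of a freelist */

/* Stand-in for the page allocator: zeroes when asked to. */
static void *sketch_page_alloc(unsigned int flags, size_t size)
{
	void *p = malloc(size);

	if (p && (flags & SKETCH_GFP_ZERO)) {
		memset(p, 0, size);
		sketch_zero_passes++;
	}
	return p;
}

/* Slow path: reached only when the freelist is empty. Filter here. */
static void *sketch_slow_alloc(unsigned int flags)
{
	return sketch_page_alloc(flags & ~SKETCH_GFP_ZERO, SKETCH_OBJSIZE);
}

/* Common entry point: the fast path does not touch the flags. */
static void *sketch_slab_alloc(unsigned int flags)
{
	void *object = sketch_freelist;

	if (object)
		sketch_freelist = NULL;	/* fast path: take the cached object */
	else
		object = sketch_slow_alloc(flags);

	/* The single place where zeroing happens for slab objects. */
	if ((flags & SKETCH_GFP_ZERO) && object) {
		memset(object, 0, SKETCH_OBJSIZE);
		sketch_zero_passes++;
	}
	return object;
}
```

With an empty freelist, a zeroing request in this model goes through the
slow path and the memory is still zeroed only once.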

It seemed to me that the reason for inserting the filtering of __GFP_ZERO
there was the belief that the page allocator cannot take __GFP_ZERO
through kmalloc_large() if we are in an interrupt.

The use of kmalloc_large() in __slab_alloc() is a bit strange at this
point. The cleanup work in 2.6.26 will make this all nice again.