Re: [patch] SLQB slab allocator
From: Hugh Dickins
Date: Tue Feb 03 2009 - 07:19:41 EST
On Tue, 3 Feb 2009, Zhang, Yanmin wrote:
> On Mon, 2009-02-02 at 11:00 +0200, Pekka Enberg wrote:
> > On Mon, 2009-02-02 at 11:38 +0800, Zhang, Yanmin wrote:
> > > Can we add a checking about free memory page number/percentage in function
> > > allocate_slab that we can bypass the first try of alloc_pages when memory
> > > is hungry?
> >
> > If the check isn't too expensive, I don't see any reason not to. How would
> > you go about checking how many free pages there are, though? Is there
> > something in the page allocator that we can use for this?
>
> We can use nr_free_pages(), totalram_pages and hugetlb_total_pages(). The
> patch below is an attempt. I tested it with hackbench and tbench on my
> stoakley (2 quad-core processors) and tigerton (4 quad-core processors).
> There is almost no regression.
May I repeat what I said yesterday? Certainly I'm oversimplifying,
but if I'm plain wrong, please correct me.
Having lots of free memory is a temporary accident following process
exit (when lots of anonymous memory has suddenly been freed), before
it has been put to use for page cache. The kernel tries to run with
a certain amount of free memory in reserve, and the rest of memory
put to (potentially) good use. I don't think we have the number
you're looking for there, though perhaps some approximation could
be devised (or I'm looking at the problem the wrong way round).
Perhaps feedback from vmscan.c, on how much it's having to write back,
would provide a good clue. There's plenty of stats maintained there.
>
> Besides this patch, I have another patch to try to reduce the calculation
> of "totalram_pages - hugetlb_total_pages()", but it touches many files.
> So just post the first simple patch here for review.
>
>
> Hugh,
>
> Would you like to test it on your machines?
Indeed I shall, starting in a few hours when I've finished with trying
the script I promised yesterday to send you. And I won't be at all
surprised if your patch eliminates my worst cases, because I don't
expect to have any significant amount of free memory during my testing,
and my swap testing suffers from slub's thirst for higher orders.
But I don't believe the kind of check you're making is appropriate,
and I do believe that when you try more extensive testing, you'll find
regressions in other tests which were relying on the higher orders.
If all of your testing happens to have lots of free memory around,
I'm surprised; but perhaps I'm naive about how things actually work,
especially on the larger machines.
Or maybe your tests are relying crucially on the slabs allocated at
system startup, when of course there should be plenty of free memory
around.
By the way, when I went to remind myself of what nr_free_pages()
actually does, my grep immediately hit this remark in mm/mmap.c:
* nr_free_pages() is very expensive on large systems,
I hope that's just a stale comment from before it was converted
to global_page_state(NR_FREE_PAGES)!
Hugh