Re: [PATCH: 003/012] Memory hotplug for new nodes v.2. (Wait table and zonelists initalization)
From: Yasunori Goto
Date: Sat Feb 18 2006 - 01:38:04 EST
> > Signed-off-by: Dave Hansen <haveblue@xxxxxxxxxx>
>
> I have _not_ signed off on these patches. I understand you based the
> initial code from one of my patches, but please remove my signed-off-by,
> as I think they code has diverged sufficiently from mine.
I see.
> > Index: pgdat3/mm/page_alloc.c
> > ===================================================================
> > --- pgdat3.orig/mm/page_alloc.c 2006-02-17 16:52:50.000000000 +0900
> > +++ pgdat3/mm/page_alloc.c 2006-02-17 18:41:52.000000000 +0900
> > @@ -37,6 +37,7 @@
> > #include <linux/nodemask.h>
> > #include <linux/vmalloc.h>
> > #include <linux/mempolicy.h>
> > +#include <linux/stop_machine.h>
> >
> > #include <asm/tlbflush.h>
> > #include "internal.h"
> > @@ -2077,18 +2078,35 @@ void __init setup_per_cpu_pageset(void)
> > static __meminit
> > void zone_wait_table_init(struct zone *zone, unsigned long zone_size_pages)
> > {
> > - int i;
> > + int i, hotadd = (system_state == SYSTEM_RUNNING);
> > struct pglist_data *pgdat = zone->zone_pgdat;
> >
> > /*
> > * The per-page waitqueue mechanism uses hashed waitqueues
> > * per zone.
> > */
> > - zone->wait_table_size = wait_table_size(zone_size_pages);
> > - zone->wait_table_bits = wait_table_bits(zone->wait_table_size);
> > - zone->wait_table = (wait_queue_head_t *)
> > - alloc_bootmem_node(pgdat, zone->wait_table_size
> > - * sizeof(wait_queue_head_t));
> > + if (hotadd){
> > + unsigned long size = 4096UL; /* Max size */
>
> Where does this come from?
This comes from in wait_table_size().
1656 /*
1657 * Once we have dozens or even hundreds of threads sleeping
1658 * on IO we've got bigger problems than wait queue collision.
1659 * Limit the size of the wait table to a reasonable size.
1660 */
1661 size = min(size, 4096UL);
1662
> > + wait_queue_head_t *p;
> > +
> > + while (size){
> > + p = kmalloc(size * sizeof(wait_queue_head_t),
> > + GFP_ATOMIC);
> > + if (p)
> > + break;
> > + size >>= 1;
> > + }
>
> Huh? Trying to allocate the largest wait queue that kmalloc will let
> you?
>
> Don't we want to base the wait queue size on something devised at
> runtime? If we make a bad guess here, how is it fixed later?
At least, could you mention how bad is biggest size?
I don't think I understand how you mind it... :-(.
Is it performance?
The comment of wait_table_size() says that 4096 is enough size.
And I think even if one section size is 16MB, user would like to add
more memory like over 4GB. Because memory hotplug is feature for
mission critical system. In other words, DB will require much memory.
Even if wait_table_size become variable dynamically, it doen't have
much value I think......
--
Yasunori Goto
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/