Re: [RFC 1/2] x86_64, mm: Delay initializing large portion of memory
From: Rob Landley
Date: Tue Jun 25 2013 - 00:14:55 EST
On 06/21/2013 11:25:33 AM, Nathan Zimmer wrote:
On a 16TB system it can take upwards of two hours to boot the system,
with about 60% of the time being spent initializing memory. This patch
delays initializing a large portion of memory until after the system is
booted. This can significantly reduce the time it takes to boot the
system, down to the 15 to 30 minute range.
Why is this conditional? Initialize the minimum amount of memory to
bring up each NUMA node, and then have each processor initialize its
own memory. I would have thought it was already doing this...
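To make the suggestion concrete, here is a rough user-space analogy of the two-phase scheme, not kernel code. Everything in it is an assumption for illustration: NR_NODES, BOOT_BYTES, struct node, and the use of pthreads to stand in for per-node processors. Phase one serially initializes only the minimum each node needs, and phase two lets one worker per node finish its own range in parallel.

```c
#include <pthread.h>
#include <string.h>

#define NR_NODES   4             /* hypothetical node count */
#define BOOT_BYTES (64 * 1024)   /* hypothetical minimum needed to bring a node up */

/* Hypothetical stand-in for one NUMA node's memory range. */
struct node {
    unsigned char *mem;
    size_t size;
    size_t initialized;          /* bytes initialized so far */
};

/* Worker: a node finishes initializing its own deferred range. */
static void *init_rest(void *arg)
{
    struct node *nd = arg;

    memset(nd->mem + nd->initialized, 0, nd->size - nd->initialized);
    nd->initialized = nd->size;
    return NULL;
}

/* Returns 0 on success, -1 if nr is too large or a worker could not start. */
static int init_nodes(struct node *nodes, int nr)
{
    pthread_t tid[NR_NODES];
    int i, started = 0, ret = 0;

    if (nr > NR_NODES)
        return -1;

    /* Phase 1: serial, but only the small amount each node needs to come up. */
    for (i = 0; i < nr; i++) {
        size_t boot = nodes[i].size < BOOT_BYTES ? nodes[i].size : BOOT_BYTES;

        memset(nodes[i].mem, 0, boot);
        nodes[i].initialized = boot;
    }

    /* Phase 2: one worker per node initializes the remainder in parallel. */
    for (i = 0; i < nr; i++) {
        if (pthread_create(&tid[i], NULL, init_rest, &nodes[i]) != 0) {
            ret = -1;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return ret;
}
```

The point of the sketch is only that the serial part shrinks to NR_NODES * BOOT_BYTES, while the bulk of the work scales with the number of nodes rather than running on one CPU.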
+ delay_mem_init=B:M:n:l:h
+ This delays the initialization of a large portion of
+ memory by inserting it into the "absent" memory list.
+ This allows the system to boot up much faster at the
+ expense of the time needed to add this absent memory
+ after the system has booted. That however can be done
+ in parallel with other operations.
This seems like a giant advertisement primarily aimed at repeating why
you think we need to merge the patch, not explaining what it is or how
to use it.
I would rephrase:
Defer memory initialization until after SMP init (so
large memory ranges can be initialized in parallel) by
moving memory not needed during boot to the "absent" list.
And I repeat: why do we need to micromanage this? It sounds like all
NUMA systems should do something like this. (Single-threaded memory
initialization in an SMP system is kind of weird.)
+ Format: B:M:n:l:h
+ (1 << B) is the block size (bsize)
+ ['0' indicates use the default 128M]
+ (1 << M) is the address space per node
+ (n * bsize) is minimum sized node memory to slice
+ (l * bsize) is low memory to leave on node
+ (h * bsize) is high memory to leave on node
I don't understand this in the slightest. I understand "low memory to
leave on the node", I have no idea why there are four other parameters.
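For reference, the arithmetic the quoted format describes can be spelled out as below. This is only a reading of the documentation text, not the patch's parser: struct delay_mem_params, parse_delay_mem_init() and the field names are hypothetical, and the sample values in the note afterwards are made up.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the five fields; not the patch's own structure. */
struct delay_mem_params {
    uint64_t bsize;      /* block size in bytes: 1 << B, or 128M when B is 0 */
    uint64_t node_span;  /* address space per node in bytes: 1 << M */
    uint64_t min_node;   /* n * bsize: minimum sized node memory to slice */
    uint64_t low_keep;   /* l * bsize: low memory to leave on the node */
    uint64_t high_keep;  /* h * bsize: high memory to leave on the node */
};

/* Returns 0 on success, -1 unless arg is exactly five colon-separated numbers. */
static int parse_delay_mem_init(const char *arg, struct delay_mem_params *p)
{
    unsigned int b, m;
    unsigned long long n, l, h;
    char trailing;

    /* The extra %c catches junk after the fifth field. */
    if (sscanf(arg, "%u:%u:%llu:%llu:%llu%c", &b, &m, &n, &l, &h, &trailing) != 5)
        return -1;
    if (b >= 64 || m >= 64)
        return -1;       /* shift would overflow a 64-bit value */

    p->bsize = b ? (1ULL << b) : (128ULL << 20);
    p->node_span = 1ULL << m;
    p->min_node = n * p->bsize;
    p->low_keep = l * p->bsize;
    p->high_keep = h * p->bsize;
    return 0;
}
```

Under this reading, a made-up value such as delay_mem_init=27:40:16:8:4 would mean 128M blocks, 1T of address space per node, nodes under 2G left alone, and 1G low plus 512M high kept on each node.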
+config DELAY_MEM_INIT
+ bool "Delay memory initialization"
+ depends on EFI && MEMORY_HOTPLUG_SPARSE
+ ---help---
+ This option delays initializing a large portion of memory
+ until after the system is booted. This can significantly
+ reduce the time it takes to boot the system when there
+ is a significant amount of memory present. Systems with
+ 8TB or more of memory benefit the most.
I can see an SMP phone wanting to use this to shave a quarter second
off its boot time. Your "large portion of memory" description is a bit
myopic.
Rob