On 05/06/2015 06:22 AM, Mel Gorman wrote:
> On Wed, May 06, 2015 at 08:12:46AM +0100, Mel Gorman wrote:
>> On Tue, May 05, 2015 at 03:25:49PM -0700, Andrew Morton wrote:
>>> On Tue, 5 May 2015 23:13:29 +0100 Mel Gorman <mgorman@xxxxxxx> wrote:
>>>
>>> eh? Changes are only needed on the allocation-attempt-failed path,
>>> which is slow-path.
>>>
>>> Alternatively, the page allocator can go off and synchronously
>>> initialize some pageframes itself. Keep doing that until the
>>> allocation attempt succeeds.
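>>>
>>> An untested sketch of the shape of it, with the helper name invented
>>> for illustration:
>>>
>>>         /* In the allocation-attempt-failed path */
>>>         do {
>>>                 page = get_page_from_freelist(...);  /* args elided */
>>>                 if (page)
>>>                         goto got_pg;
>>>                 /*
>>>                  * Synchronously initialise another batch of deferred
>>>                  * struct pages; returns false when nothing is left.
>>>                  */
>>>         } while (deferred_init_more_pages(preferred_zone, 1UL << order));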
>>
>> That was rejected during review of earlier attempts at this feature on
>> the grounds that it impacted allocator fast paths.
>>
>> We'd have to distinguish between falling back to other zones because the
>> high zone is artificially exhausted and normal ALLOC_BATCH exhaustion. We'd
>> also have to avoid falling back to remote nodes prematurely. While I have
>> not tried an implementation, I expected the checks would need to be in the
>> fast paths unless I used jump labels to get around it. I'm going to try
>> altering when we initialise instead so that it happens earlier.
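>>
>> If it had gone that way, the checks would have sat behind a jump label
>> so they patch out to a no-op once initialisation completes. Hand-wavy
>> sketch, names invented:
>>
>>         static struct static_key deferred_pages = STATIC_KEY_INIT_TRUE;
>>
>>         /* allocator fast path */
>>         if (static_key_true(&deferred_pages))
>>                 check_deferred_zone(zone, order);
>>
>>         /* after the last struct page is initialised */
>>         static_key_slow_dec(&deferred_pages);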
>
> Which looks as follows. Waiman, a test on the 24TB machine would be
> appreciated again. This patch should be applied instead of "mm: meminit:
> Take into account that large system caches scale linearly with memory".
>
> ---8<---
> mm: meminit: Finish initialisation of memory before basic setup
>
> Waiman Long reported that 24TB machines hit OOM during basic setup when
> struct page initialisation was deferred. One approach is to initialise
> memory on demand, but it interferes with page allocator paths. This patch
> creates dedicated threads to initialise memory before basic setup. It
> then blocks on a rw_semaphore until completion, as a wait_queue and
> counter would be overkill. This may be slower to boot but it's simpler
> overall, and it also gets rid of a lot of section mangling which existed
> so that kswapd could do the initialisation.
>
> Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
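>
> The core of the mechanism is roughly the following (simplified): each
> per-node thread holds the semaphore for read, so the down_write()
> blocks until the last thread finishes and drops its read lock.
>
>         static __initdata DECLARE_RWSEM(pgdat_init_rwsem);
>
>         void __init page_alloc_init_late(void)
>         {
>                 int nid;
>
>                 for_each_node_state(nid, N_MEMORY) {
>                         down_read(&pgdat_init_rwsem);
>                         kthread_run(deferred_init_memmap, NODE_DATA(nid),
>                                     "pgdatinit%d", nid);
>                 }
>
>                 /* Block until all are initialised */
>                 down_write(&pgdat_init_rwsem);
>                 up_write(&pgdat_init_rwsem);
>         }
>
> with deferred_init_memmap() doing the per-node work and calling
> up_read() when its node is done.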

This patch moves the deferred meminit from kswapd to its own kernel
threads started after smp_init(). However, the hash table allocation is
done earlier than that, so it seems like it will still run out of memory
on the 24TB machine that I tested on.
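
For reference, the rough boot ordering is:

        start_kernel()
            ...
            vfs_caches_init()   /* dentry/inode hash tables via
                                   alloc_large_system_hash() */
            rest_init()
                kernel_init()
                    kernel_init_freeable()
                        smp_init()
                        /* deferred meminit threads run here with this patch */

so the large hash table allocations happen before the deferred memory is
available.
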
I will certainly try it out, but I doubt it will solve the problem on its own.