Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit

From: Christopher Lameter
Date: Thu Sep 14 2017 - 12:40:03 EST

Next message: Andreas Dilger: "Re: kernel BUG at fs/ext4/fsync.c:LINE!"
Previous message: Tycho Andersen: "Re: [PATCH] xen: don't compile pv-specific parts if XEN_PV isn't configured"
In reply to: Linus Torvalds: "Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, 13 Sep 2017, Tim Chen wrote:

> Here's what the customer thinks happened and is willing to tell us.
> They have a parent process that spawns off 10 children per core and
> kicks them off to run. The child processes all access a common library.
> We have 384 cores, so 3840 child processes are running. When migration
> occurs on a page in the common library, the first child that accesses the
> page will page fault and lock the page, with the other children also page
> faulting quickly and piling up in the page wait list until the first child
> is done.

I think we need some way to avoid migration in cases like this. This is
crazy. Page migration was not written to deal with something like this.
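
To put rough numbers on the pile-up described in the quoted report, here is
a small user-space sketch. It is only an illustration under stated
assumptions: both function names are hypothetical (they are not kernel
functions), every child is assumed to fault on the same page, and the batch
size passed to the second helper is an arbitrary value, not one taken from
the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel function: number of tasks queued on
 * the page's wait list while the first faulting child holds the page
 * lock, assuming every child faults on the same page.
 */
unsigned long pileup_waiters(unsigned long cores,
			     unsigned long children_per_core)
{
	unsigned long children = cores * children_per_core;

	assert(children > 0);	/* someone has to hold the page lock */
	return children - 1;	/* everyone except the lock holder waits */
}

/*
 * Hypothetical model of a wake-up that walks the wait list in fixed-size
 * batches and releases the queue lock between batches. Returns how many
 * batches are needed to walk all waiters. The batch size is an assumed
 * parameter for illustration only.
 */
unsigned long wake_batches(unsigned long waiters, unsigned long batch)
{
	assert(batch > 0);
	return (waiters + batch - 1) / batch;	/* round up */
}
```

With the figures from the report, pileup_waiters(384, 10) gives 3839 tasks
queued behind the single lock holder, all of which have to be walked in one
go unless the wake-up is broken into batches.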
