Re: [PATCH 1/2] sched/wait: Break up long wake list walk

From: Linus Torvalds
Date: Wed Aug 23 2017 - 14:17:35 EST


On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
>
> Will you still consider the original patch as a fail safe mechanism?

I don't think we have much choice, although I would *really* want to
get this root-caused rather than just papering over the symptoms.

Maybe still worth testing that "sched/numa: Scale scan period with
tasks in group and shared/private" patch that Mel mentioned.

In fact, looking at that patch description, it does seem to match this
particular load a lot. Quoting from the commit message:

"Running 80 tasks in the same group, or as threads of the same process,
results in the memory getting scanned 80x as fast as it would be if a
single task was using the memory.

This really hurts some workloads"

So if 80 threads cause 80x as much scanning, a few thousand threads
might indeed be really really bad.

So once more unto the breach, dear friends, once more.

Please.

The patch got applied to -tip as commit b5dd77c8bdad, and can be
downloaded here:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=b5dd77c8bdada7b6262d0cba02a6ed525bf4e6e1

(Hmm. It says it's cc'd to me, but I never noticed that patch simply
because it was in a big group of other -tip commits. Oh well).

Linus