Re: [PATCH RESEND] sched: prefer an idle cpu vs an idle sibling for BALANCE_WAKE

From: Mike Galbraith
Date: Thu Jun 04 2015 - 00:52:42 EST


On Wed, 2015-06-03 at 16:34 -0400, Josef Bacik wrote:
> On 06/03/2015 01:43 PM, Mike Galbraith wrote:

> > There are also other loads like your server where waking to an idle cpu
> > dominates all else, pgbench is one of those. In that case, you've got a
> > 1:N waker/wakee relationship, and what matters above ALL else is when
> > the mother of all work (the single server thread) wants a CPU, it had
> > better get it NOW, else the load stalls. Likewise, 'mom' being
> > preempted hurts truckloads. Perhaps your server has a similar thing
> > going on, keeping wakees the hell away from the waker rules all.
> >
>
> Yeah our server has two waker threads (one per numa node) and then the N
> number of wakee threads. I'll run tbench and pgbench with the new
> patches and see if there's a degradation. Thanks,

If you look for wake_wide(), it could perhaps be used to select the wider
search only for the right flavor of load component when BALANCE_WAKE is
set. That would let the cache lovers in your box continue to perform
while improving the 1:N component. That wider search still needs to
become cheaper though, the low hanging fruit being to stop searching as
soon as you find a cpu with load = 0. But you may then meet the energy
efficient folks, who iirc want to make it even more expensive.
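
To make the "stop searching when you find load = 0" bit concrete, here is
a rough userspace sketch. It is not scheduler code: the flat load[] array,
the function name find_least_loaded_cpu() and the scanned counter are all
made up for illustration, standing in for the real per-cpu load lookups
and domain walk.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of a wide search: load[i] is the load of cpu i.
 * Returns the index of the least loaded cpu seen, and bails out as soon
 * as a cpu with load 0 turns up, since nothing can beat that.
 * *scanned (optional) reports how many entries were actually examined.
 */
static int find_least_loaded_cpu(const unsigned long *load, size_t ncpus,
                                 size_t *scanned)
{
    size_t best = 0;
    size_t i;

    assert(load != NULL && ncpus > 0);

    for (i = 0; i < ncpus; i++) {
        if (load[i] < load[best])
            best = i;
        if (load[i] == 0) {
            i++;        /* count the entry we just looked at */
            break;      /* fully idle cpu found, stop the scan */
        }
    }

    if (scanned)
        *scanned = i;
    return (int)best;
}
```

With loads {3, 2, 0, 1} the scan picks cpu 2 and never looks at the last
entry. With no idle cpu at all it still has to walk the whole array, which
is the case that stays expensive.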

wake_wide() inadvertently helped another sore spot btw - a gaggle of
pretty light tasks being awakened from an interrupt source tended to
cluster around that source, preventing such loads from being all they
can be in a very similar manner. Xen (shudder;) showed that nicely in
older kernels, due to the way its weird dom0 gizmo works.

-Mike


