Re: [RFC 1/2] sched: reduce migration cost between faster caches for idle_balance

From: Mike Galbraith
Date: Thu Feb 15 2018 - 23:54:21 EST


On Thu, 2018-02-15 at 10:07 -0800, Rohit Jain wrote:
>
> > Rohit is running more tests with a patch that deletes
> > sysctl_sched_migration_cost from idle_balance, and for his patch but
> > with the 5000 usec mistake corrected back to 500 usec. So far both
> > give improvements over the baseline, but for different cases, so we
> > need to try more workloads before we draw any conclusions.
> >
> > Rohit, can you share your data so far?
>
> Results:
>
> In the following results, "Domain based" approach is as mentioned in the
> RFC sent out with the values fixed (As pointed out by Mike). "No check" is
> the patch where I just remove the check against sysctl_sched_migration_cost
>
> 1) Hackbench results on 2 socket, 44 core and 88 threads Intel x86 machine
> (lower is better):
>
> +--------------+-----------------+--------------------------+-------------------------+
> |              | Without Patch   |Domain Based              |No Check                 |
> +------+-------+--------+--------+-----------------+--------+----------------+--------+
> |Loops | Groups|Average |%Std Dev|Average          |%Std Dev|Average         |%Std Dev|
> +------+-------+--------+--------+-----------------+--------+----------------+--------+
> |100000| 4     |9.701   |0.78    |7.971  (+17.84%) | 1.34   |8.919  (+8.07%) |1.07    |
> |100000| 8     |17.186  |0.77    |16.712 (+2.76%)  | 0.87   |17.043 (+0.83%) |0.83    |
> |100000| 16    |30.378  |0.55    |29.780 (+1.97%)  | 0.38   |29.565 (+2.67%) |0.29    |
> |100000| 32    |54.712  |0.54    |53.001 (+3.13%)  | 0.19   |52.158 (+4.67%) |0.22    |
> +------+-------+--------+--------+-----------------+--------+----------------+--------+

Previous numbers:

+-------+----+-------+-------------------+--------------------------+
|       |    |       | Without patch     | With patch               |
+-------+----+-------+---------+---------+----------------+---------+
|Loops  |FD  |Groups | Average |%Std Dev |Average         |%Std Dev |
+-------+----+-------+---------+---------+----------------+---------+
|100000 |40  |4      | 9.701   |0.78     |9.623  (+0.81%) |3.67     |
|100000 |40  |8      | 17.186  |0.77     |17.068 (+0.68%) |1.89     |
|100000 |40  |16     | 30.378  |0.55     |30.072 (+1.52%) |0.46     |
|100000 |40  |32     | 54.712  |0.54     |53.588 (+2.28%) |0.21     |
+-------+----+-------+---------+---------+----------------+---------+

My take on this (not that you have to sell it to me, you don't), when I
squint at these together, is: submit the one-liner, and take the rest
back to the drawing board.  You've got nothing but high std dev numbers
in (imo) way too finicky/unrealistic hackbench to sell these not so
pretty patches.
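
One rough way to put numbers on that squint is to check whether each
reported gain is larger than the run-to-run spread.  The Python sketch
below is illustrative only: the rows are copied from the previous-numbers
table, the function names are made up for this example, and "gain no
bigger than the larger %Std Dev" is an assumed rule of thumb, not a
proper significance test.

```python
# Rows copied from the previous-numbers table:
# (groups, baseline_avg, baseline_sd_pct, patched_avg, patched_sd_pct)
PREVIOUS_ROWS = [
    (4, 9.701, 0.78, 9.623, 3.67),
    (8, 17.186, 0.77, 17.068, 1.89),
    (16, 30.378, 0.55, 30.072, 0.46),
    (32, 54.712, 0.54, 53.588, 0.21),
]


def improvement_pct(baseline, patched):
    """Relative improvement in percent; hackbench times, lower is better."""
    return (baseline - patched) / baseline * 100.0


def within_noise(baseline, baseline_sd_pct, patched, patched_sd_pct):
    """Crude heuristic (an assumption of this sketch): treat the gain as
    noise if it does not exceed the larger of the two %Std Dev values."""
    gain = improvement_pct(baseline, patched)
    return gain <= max(baseline_sd_pct, patched_sd_pct)


if __name__ == "__main__":
    for groups, base, base_sd, new, new_sd in PREVIOUS_ROWS:
        gain = improvement_pct(base, new)
        verdict = "noise" if within_noise(base, base_sd, new, new_sd) else "signal"
        print(f"groups={groups:>2}  gain={gain:.2f}%  {verdict}")
```

The percentages this recomputes from the averages may differ slightly
from the ones printed in the tables, since the tables were presumably
derived from unrounded data.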

I bet you can easily sell that one-liner, because that removes an old
wart (me stealing migration_cost in the first place), instead of making
that wart a whole lot harder to intentionally not notice.

-Mike