Re: [PATCH 07/10] sched/fair: Provide can_migrate_task_llc

From: Steven Sistare
Date: Fri Oct 26 2018 - 14:29:33 EST


On 10/26/2018 2:04 PM, Valentin Schneider wrote:
> Hi Steve,
> On 22/10/2018 15:59, Steve Sistare wrote:
>> Define a simpler version of can_migrate_task called can_migrate_task_llc
>> which does not require a struct lb_env argument, and judges whether a
>> migration from one CPU to another within the same LLC should be allowed.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@xxxxxxxxxx>
>> ---
>> kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 4acdd8d..6548bed 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7168,6 +7168,34 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>> }
>>
>> /*
>> + * Return true if task @p can migrate from @rq to @dst_rq in the same LLC.
>> + * No need to test for co-locality, and no need to test task_hot(), as sharing
>> + * LLC provides cache warmth at that level.
>
> I was thinking that perhaps we could have scenarios where some rq's
> keep stealing tasks off of each other and we end up circulating tasks
> between CPUs. Now, that would only happen if we had a handful of tasks
> with a very tiny period, and I'm not familiar with any (real) hyperactive
> workloads, similar to those generated by hackbench, where that could happen.

That will not happen with the current code, as it only steals from a src rq with
nr_running > 1. The src loses a task, and the dst gains it and has nr_running == 1,
so the task will not be re-stolen.
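
To make that argument concrete, here is a minimal toy model, not the patch code.
struct toy_rq, toy_can_steal_from() and toy_steal() are hypothetical names, and
the only assumption is the nr_running > 1 condition described above.

```c
#include <stdbool.h>

/* Toy stand-in for a runqueue; only the task count matters here. */
struct toy_rq {
	unsigned int nr_running;
};

/* Assumed steal condition: the source must have more than one task. */
static bool toy_can_steal_from(const struct toy_rq *src)
{
	return src->nr_running > 1;
}

/* Move one task from src to dst if allowed; return true if a task moved. */
static bool toy_steal(struct toy_rq *dst, struct toy_rq *src)
{
	if (!toy_can_steal_from(src))
		return false;
	src->nr_running--;
	dst->nr_running++;
	return true;
}
```

Starting with src at 2 tasks and dst at 0, one steal leaves both at 1, and after
that neither rq qualifies as a source, so the task cannot bounce back.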

If we modify the code to handle misfits, we may steal when src nr_running == 1,
but a fast CPU will only steal the lone task from a slow one, never fast from fast,
and never slow from fast, so no tug of war.
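
A sketch of what such a misfit rule might look like, purely as an illustration.
Nothing here exists in the patch: struct toy_cpu, its capacity field and
toy_can_steal_misfit() are hypothetical, and comparing raw capacity values is an
assumed way of telling fast from slow.

```c
#include <stdbool.h>

/* Toy CPU: a relative capacity value plus the number of runnable tasks. */
struct toy_cpu {
	unsigned long capacity;    /* hypothetical: larger means faster */
	unsigned int nr_running;
};

/*
 * Allow a steal if src has spare tasks, or if src has a lone task and
 * dst is strictly faster than src. Equal or lower capacity never
 * steals a lone task.
 */
static bool toy_can_steal_misfit(const struct toy_cpu *dst,
				 const struct toy_cpu *src)
{
	if (src->nr_running > 1)
		return true;
	return src->nr_running == 1 && dst->capacity > src->capacity;
}
```

Because the lone-task case needs a strictly greater capacity, the relation is
asymmetric: if A may take B's lone task, B may never take it back from A.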

> In short, I wonder if we should have task_hot() in there. Drawing a
> parallel with load_balance(), even if load-balancing is happening between
> rqs of the same LLC, we do go check task_hot(). Have you already experimented
> with adding a task_hot() check in here?

I tried task_hot(), to see if L1/L2 cache warmth matters much on L1/L2/L3 systems,
and it reduced both the number of steals and overall performance.

> I've run some iterations of hackbench (hackbench 2 process 100000) to
> investigate this task bouncing, but I didn't really see any of it. That was
> just a 4+4 big.LITTLE system though, I'll try to get numbers on a system
> with more CPUs.
>
> ----->8-----
>
> activations: # of task activations (task starts running)
> cpu_migrations: # of activations where cpu != prev_cpu
> % stats are percentiles
>
> - STEAL:
>
> | stat  | cpu_migrations | activations |
> |-------+----------------+-------------|
> | count |    2005.000000 | 2005.000000 |
> | mean  |      16.244888 |  290.608479 |
> | std   |      38.963138 |  253.003528 |
> | min   |       0.000000 |    3.000000 |
> | 50%   |       3.000000 |  239.000000 |
> | 75%   |       8.000000 |  436.000000 |
> | 90%   |      45.000000 |  626.000000 |
> | 99%   |     188.960000 | 1073.000000 |
> | max   |     369.000000 | 1417.000000 |
>
> - NO_STEAL:
>
> | stat  | cpu_migrations | activations |
> |-------+----------------+-------------|
> | count |    2005.000000 | 2005.000000 |
> | mean  |      15.260848 |  297.860848 |
> | std   |      46.331890 |  253.210813 |
> | min   |       0.000000 |    3.000000 |
> | 50%   |       3.000000 |  252.000000 |
> | 75%   |       7.000000 |  444.000000 |
> | 90%   |      32.600000 |  643.600000 |
> | 99%   |     214.880000 | 1127.520000 |
> | max   |     467.000000 | 1547.000000 |
>
> ----->8-----
>
> Otherwise, my only other concern at the moment is that since stealing
> doesn't care about load, we could steal a task that would cause a big
> imbalance, which wouldn't have happened with a call to load_balance().
>
> I don't think this can be triggered with a symmetrical workload like
> hackbench, so I'll go explore something else.

The dst is about to go idle with zero load, so stealing can only improve the
instantaneous balance between src and dst. For longer term average load, we
still rely on periodic load_balance to make adjustments.
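
The arithmetic behind that claim, as a small sketch under simplifying
assumptions: load is a single non-negative number per rq, dst starts at zero,
and the stolen task's load does not exceed the src load. The function names are
hypothetical and this is not how the scheduler computes imbalance.

```c
/* Absolute load difference between two runqueues. */
static unsigned long toy_imbalance(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/*
 * Imbalance after an idle dst (load 0) steals a task of load task_load
 * from a src whose total load is src_load. Requires
 * task_load <= src_load. Before the steal the imbalance is src_load.
 */
static unsigned long toy_imbalance_after_steal(unsigned long src_load,
					       unsigned long task_load)
{
	return toy_imbalance(src_load - task_load, task_load);
}
```

Since |(S - t) - t| <= S whenever 0 <= t <= S, the imbalance between the two
rqs right after the steal is never larger than before it, even if the stolen
task is the heaviest one on src.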

All good questions, keep them coming.

- Steve