RE: [Linuxarm] Re: [PATCH v2] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2
From: Valentin Schneider
Date: Thu Feb 18 2021 - 09:37:03 EST
Hi Barry,
On 18/02/21 09:17, Song Bao Hua (Barry Song) wrote:
> Hi Valentin,
>
> I understand Peter's concern is that the local group has different
> size with remote groups. Is this patch resolving Peter's concern?
> To me, it seems not :-)
>
If you remove the '&& i != cpu' in build_overlap_sched_groups() you get
that, but then you also get some extra warnings :-)
Now yes, should_we_balance() only matters for the local group. However, I'm
somewhat wary of messing with the local groups; for one, it means you would
now have more than one topology level (tl) accessing the same
sgc->next_update, sgc->{min,max}_capacity and sgc->group_imbalance (as
Vincent had pointed out).
By ensuring only remote (i.e. !local) groups are modified (which is what
your patch does), we absolve ourselves of this issue, which is why I prefer
this approach ATM.
> Though I don’t understand why different group sizes will be harmful
> since all groups are calculating avg_load and group_type based on
> their own capacities. Thus, for a smaller group, its capacity would
> be smaller.
>
> Is it because a bigger group has relatively less chance to pull, so
> load balancing will be completed more slowly while small groups have
> high load?
>
Peter's point is that, if at a given tl you have groups that look like
g0: 0-4, g1: 5-6, g2: 7-8
Then g0 is half as likely to pull tasks with load_balance() as g1 or g2
(due to the group size vs should_we_balance()).
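One way to read the group-size argument is as a toy model, sketched below. It is not kernel code: struct group_model and pull_attempts_per_cpu_permille() are hypothetical names, and the model assumes should_we_balance() lets exactly one CPU of the local group run load_balance() per balance interval, so that a group's pull attempts are shared among all of its CPUs.

```c
#include <assert.h>

/* Hypothetical model, not a kernel structure: a group known only by its size. */
struct group_model {
	const char *name;
	unsigned int nr_cpus;
};

/*
 * Assumption: one CPU per local group passes should_we_balance() in each
 * balance interval, so the group gets one pull attempt to share among
 * nr_cpus CPUs. Returns pull attempts per CPU, in units of 1/1000
 * (integer division, rounded down).
 */
static unsigned int pull_attempts_per_cpu_permille(const struct group_model *g)
{
	assert(g->nr_cpus > 0);	/* an empty group makes no sense here */
	return 1000u / g->nr_cpus;
}
```

Under that assumption, a group spanning 0-4 (five CPUs) gets 200 per CPU, while a group spanning 5-6 (two CPUs) gets 500, so the CPUs of the larger group are served less often.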
However, I suppose one "trick" to be aware of here is that since your patch
*doesn't* change the local group, we do have e.g. on CPU0:
[ 0.374840] domain-2: span=0-5 level=NUMA
[ 0.375054] groups: 0:{ span=0-3 cap=4003 }, 4:{ span=4-5 cap=1988 }
*but* on CPU4 we get:
[ 0.387019] domain-2: span=0-1,4-7 level=NUMA
[ 0.387211] groups: 4:{ span=4-7 cap=3984 }, 0:{ span=0-1 cap=2013 }
IOW, at a given tl, all *local* groups have /roughly/ the same size and thus
similar pull probability (it took me writing this mail to see it that
way). So perhaps this is all fine already?
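
To check the sizes in the two dumps above, the span strings can be counted with a small userspace helper. This is only a sketch: span_weight() is a hypothetical name, it is not the kernel's cpulist parser, and it assumes the "a-b,c,d-e" format printed in the sched domain debug output.

```c
#include <stdlib.h>

/*
 * Count the CPUs in a span string such as "0-1,4-7".
 * Returns the number of CPUs, or -1 if the string is malformed.
 * Hypothetical userspace helper, written only for this example.
 */
static int span_weight(const char *span)
{
	const char *p = span;
	int weight = 0;

	while (*p) {
		char *end;
		unsigned long lo = strtoul(p, &end, 10);
		unsigned long hi = lo;

		if (end == p)
			return -1;	/* no digits where a CPU number was expected */
		if (*end == '-') {
			p = end + 1;
			hi = strtoul(p, &end, 10);
			if (end == p || hi < lo)
				return -1;	/* missing or reversed range end */
		}
		weight += (int)(hi - lo + 1);
		if (*end == ',')
			end++;		/* step over the separator */
		else if (*end != '\0')
			return -1;	/* unexpected character */
		p = end;
	}
	return weight;
}
```

With the dumps above, span_weight("0-3") and span_weight("4-7") both give 4 for the local groups, while the remote groups "4-5" and "0-1" both give 2. The two domain spans "0-5" and "0-1,4-7" both give 6.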