Re: [PATCH -tip 15/32] sched: Improve snapshotting of min_vruntime for CGroups

From: Peter Zijlstra
Date: Tue Nov 24 2020 - 05:29:03 EST


On Tue, Nov 17, 2020 at 06:19:45PM -0500, Joel Fernandes (Google) wrote:
> A previous patch improved cross-CPU vruntime comparison operations in
> pick_next_task(). Improve it further for tasks in CGroups.
>
> In particular, for cross-CPU comparisons, we were previously going to
> the root-level se(s) for both of the tasks being compared. That was strange.
> This patch instead finds the se(s) for both tasks that have the same
> parent (which may be different from root).
>
> A note about the min_vruntime snapshot and force idling:
> Abbreviations: fi: force-idled now? ; fib: force-idled before?
> During selection:
> When we're not fi, we need to update snapshot.
> When we're fi and we were not fi, we must update snapshot.
> When we're fi and we were already fi, we must not update snapshot.
>
> Which gives:
> fib  fi  update?
>  0   0     1
>  0   1     1
>  1   0     1
>  1   1     0
> So the min_vruntime snapshot needs to be updated when: !(fib && fi).
>
> Also, the cfs_prio_less() function needs to be aware of whether the core
> is in force idle or not, since it will use this information to know
> whether to advance a cfs_rq's min_vruntime_fi in the hierarchy. So pass
> this information along via pick_task() -> prio_less().
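
For reference, the table above collapses to a single predicate. A minimal
sketch in C, assuming hypothetical helper names (snapshot_needs_update() and
check_table() are illustrative only and do not come from the patch):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from the patch: returns true when the
 * min_vruntime snapshot should be refreshed during selection.
 *
 *   fib: was the core force-idled before this selection?
 *   fi:  is the core force-idled now?
 */
static bool snapshot_needs_update(bool fib, bool fi)
{
	/* Skip the update only when we were force idle and still are. */
	return !(fib && fi);
}

/* Checks the predicate against each row of the quoted truth table. */
static void check_table(void)
{
	assert(snapshot_needs_update(false, false));	/* 0 0 -> 1 */
	assert(snapshot_needs_update(false, true));	/* 0 1 -> 1 */
	assert(snapshot_needs_update(true, false));	/* 1 0 -> 1 */
	assert(!snapshot_needs_update(true, true));	/* 1 1 -> 0 */
}
```

Only the (fib, fi) = (1, 1) row keeps the old snapshot; every other
transition refreshes it.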

Hurmph.. so I'm tempted to smash a bunch of patches together.

2 <- 3 (already done - bisection crashes are daft)
6 <- 11
7 <- {10, 12}
9 <- 15

I'm thinking that would result in an easier to read series, or do we
want to preserve this history?

(fwiw, I pulled 15 before 13,14, as I think that makes more sense
anyway).

Hmm?