Re: [RFC PATCH 09/11] sched/rt: Fix proxy/current (push,pull)ability

From: Valentin Schneider
Date: Mon Oct 10 2022 - 07:40:16 EST

Next message: kernel test robot: "ld.lld: error: undefined symbol: backlight_device_get_by_name"
Previous message: Janosch Frank: "Re: [PATCH v14 1/6] KVM: s390: pv: asynchronous destroy for reboot"
Next in thread: Connor O'Brien: "Re: [RFC PATCH 09/11] sched/rt: Fix proxy/current (push,pull)ability"

On 03/10/22 21:44, Connor O'Brien wrote:
> From: Valentin Schneider <valentin.schneider@xxxxxxx>

This was one of my attempts at fixing RT load balancing (the BUG_ON in
pick_next_pushable_task() was quite easy to trigger), but I ended up
convincing myself this was insufficient: it only "tags" the donor and the
proxy, whereas the entire blocked chain needs tagging. Hopefully not all of
what I'm about to write is nonsense; some of the neurons I need for this
haven't been used in a while, so take it with a grain of salt.

Consider pick_highest_pushable_task() - we don't want any task in a blocked
chain to be pickable. There's no point in migrating it, we'll just hit
schedule()->proxy(), follow p->blocked_on and most likely move it back to
where the rest of the chain is. This applies to any sort of balancing (CFS,
RT, DL).
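
To make the "it will just get moved back" argument concrete, here is a
minimal userspace sketch in C. It is a toy model, not kernel code:
struct toy_task, toy_chain_end() and toy_migration_is_futile() are
hypothetical names, and blocked_on links tasks directly to each other,
which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for task_struct; only the fields this sketch needs. */
struct toy_task {
	const char *name;
	int cpu;			/* CPU whose runqueue holds the task */
	struct toy_task *blocked_on;	/* next link of the chain, or NULL */
};

/* Follow the blocked_on links until we reach the end of the chain. */
static struct toy_task *toy_chain_end(struct toy_task *p)
{
	assert(p != NULL);
	while (p->blocked_on)
		p = p->blocked_on;
	return p;
}

/*
 * A blocked task pushed to a CPU other than the one holding the end of
 * its chain would be sent back there, so the migration gains nothing.
 */
static int toy_migration_is_futile(struct toy_task *p, int dst_cpu)
{
	return p->blocked_on != NULL && toy_chain_end(p)->cpu != dst_cpu;
}
```

Under these assumptions, any balancer (CFS, RT or DL) could use such a
check to reject a candidate before trying to move it.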

ATM I think PE breaks the "run the N highest priority tasks on our N CPUs"
policy. Consider:

p0 (FIFO42)
|
| blocked_on
v
p1 (FIFO41)
|
| blocked_on
v
p2 (FIFO40)

On top of that, add p3, an unrelated FIFO1 task, and p4, an unrelated CFS
task.

CPU0
current: p0
proxy: p2
enqueued: p0, p1, p2, p3

CPU1
current: p4
proxy: p4
enqueued: p4

pick_next_pushable_task() on CPU0 would pick p1 as the next highest
priority task to push away to e.g. CPU1, but that would be undone as soon
as proxy() happens on CPU1: we'd notice the CPU boundary and punt it back
to CPU0. What we would want here is to pick p3 instead to have it run on
CPU1.
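
The CPU0 scenario above can be modelled with a small userspace sketch in
C. Everything here is hypothetical (struct toy_task, struct toy_rq,
toy_pick_pushable()), and rt_prio uses "larger means higher priority",
matching the FIFO numbers above rather than any kernel-internal scale. The
skip_blocked flag is an assumption used only to contrast the two
behaviours: excluding just current and the proxy versus also excluding
every task that still has blocked_on set.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for task_struct. */
struct toy_task {
	const char *name;
	int rt_prio;			/* larger means higher priority */
	struct toy_task *blocked_on;	/* next link of the chain, or NULL */
};

/* Toy runqueue mirroring the "current / proxy / enqueued" listing. */
struct toy_rq {
	struct toy_task *curr;
	struct toy_task *proxy;
	struct toy_task **queued;
	size_t nr_queued;
};

/*
 * Return the highest priority task that may be pushed away, or NULL.
 * current and the proxy are never candidates. With skip_blocked set,
 * tasks that are links of a blocked chain are skipped as well.
 */
static struct toy_task *toy_pick_pushable(const struct toy_rq *rq,
					  int skip_blocked)
{
	struct toy_task *best = NULL;

	assert(rq != NULL);
	for (size_t i = 0; i < rq->nr_queued; i++) {
		struct toy_task *p = rq->queued[i];

		if (p == rq->curr || p == rq->proxy)
			continue;
		if (skip_blocked && p->blocked_on)
			continue;
		if (!best || p->rt_prio > best->rt_prio)
			best = p;
	}
	return best;
}
```

With p0..p3 set up as in the example (current = p0, proxy = p2), calling
this with skip_blocked = 0 returns p1, and with skip_blocked = 1 it
returns p3.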

I *think* we want only the proxy of an entire blocked-chain to be visible
to load-balance. Unfortunately, PE gathers the blocked-chain onto the
donor's CPU, which kinda undoes that.

Having the blocked tasks remain in the rq is very handy as it directly
gives us the scheduling context and we can unwind the blocked chain for the
execution context, but it does wreak havoc in load-balancing :/
