Re: [PATCH -tip 10/32] sched: Fix priority inversion of cookied task with sibling

From: Balbir Singh
Date: Sun Nov 22 2020 - 17:41:51 EST


On Tue, Nov 17, 2020 at 06:19:40PM -0500, Joel Fernandes (Google) wrote:
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>
> The rationale is as follows. In the core-wide pick logic, even if
> need_sync == false, we need to go look at other CPUs (non-local CPUs) to
> see if they could be running RT.
>
> Say the RQs in a particular core look like this:
> Let CFS1 and CFS2 be 2 tagged CFS tasks. Let RT1 be an untagged RT task.
>
>         rq0                     rq1
>     CFS1 (tagged)           RT1 (not tagged)
>     CFS2 (tagged)
>
> Say schedule() runs on rq0. Now, it will enter the above loop and
> pick_task(RT) will return NULL for 'p'. It will enter the above if() block
> and see that need_sync == false and will skip RT entirely.
>
> The end result of the selection will be (say prio(CFS1) > prio(CFS2)):
>         rq0                     rq1
>         CFS1                    IDLE
>
> When it should have selected:
>         rq0                     rq1
>         IDLE                    RT
>
> Joel saw this issue on real-world usecases in ChromeOS where an RT task
> gets constantly force-idled and breaks RT. Let's cure it.
>
> NOTE: This problem will be fixed differently in a later patch. It is just
> kept here for reference purposes about this issue, and to make
> applying later patches easier.
>

The changelog is hard to read: it refers to "the above loop" and "the
above if() block", but there is no code snippet in the changelog. Also,
from what I can see following the series, p->core_cookie is not yet set
anywhere (unless I missed it), so fixing this here did not make sense to
me from just reading the series in order.
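
For anyone else trying to follow the scenario without the code in front
of them, here is a rough userspace sketch of what I understand the
changelog to describe. It is not the kernel code: struct task, struct rq,
pick_task(), pick_matching(), core_pick(), build_example() and the
scan_siblings flag are all made up for illustration, "higher prio wins"
is a toy convention, and I am assuming CFS1 and CFS2 share one cookie.

```c
#include <assert.h>
#include <stddef.h>

/* Toy scheduling classes, highest priority first (illustrative only). */
enum toy_class { TOY_RT, TOY_CFS, NR_TOY_CLASSES };

struct task {
	const char *name;
	enum toy_class cls;
	int prio;             /* toy convention: larger value wins */
	unsigned long cookie; /* 0 means untagged */
};

#define MAX_TASKS 4
struct rq {
	const struct task *tasks[MAX_TASKS];
	int nr;
};

/* Tasks from the changelog; the shared cookie 0xA is an assumption. */
static const struct task cfs1 = { "CFS1", TOY_CFS, 2, 0xA };
static const struct task cfs2 = { "CFS2", TOY_CFS, 1, 0xA };
static const struct task rt1  = { "RT1",  TOY_RT,  1, 0 };

static void build_example(struct rq rqs[2])
{
	rqs[0] = (struct rq){ .tasks = { &cfs1, &cfs2 }, .nr = 2 };
	rqs[1] = (struct rq){ .tasks = { &rt1 }, .nr = 1 };
}

/* Best task of one class on one runqueue, or NULL if there is none. */
static const struct task *pick_task(const struct rq *rq, int cls)
{
	const struct task *best = NULL;

	for (int i = 0; i < rq->nr; i++) {
		const struct task *t = rq->tasks[i];

		if ((int)t->cls == cls && (!best || t->prio > best->prio))
			best = t;
	}
	return best;
}

/* Highest-class task on rq whose cookie matches, or NULL (forced idle). */
static const struct task *pick_matching(const struct rq *rq,
					unsigned long cookie)
{
	for (int cls = 0; cls < NR_TOY_CLASSES; cls++) {
		const struct task *best = NULL;

		for (int i = 0; i < rq->nr; i++) {
			const struct task *t = rq->tasks[i];

			if ((int)t->cls == cls && t->cookie == cookie &&
			    (!best || t->prio > best->prio))
				best = t;
		}
		if (best)
			return best;
	}
	return NULL;
}

/*
 * Core-wide pick. With scan_siblings == 0 only this_cpu's runqueue is
 * consulted when choosing the core-wide max (the behaviour the changelog
 * complains about); with scan_siblings == 1 every sibling is consulted.
 * picked[cpu] == NULL stands for IDLE.
 */
static void core_pick(const struct rq *rqs, int nr_cpus, int this_cpu,
		      int scan_siblings, const struct task **picked)
{
	const struct task *max = NULL;

	assert(this_cpu >= 0 && this_cpu < nr_cpus);

	for (int cls = 0; cls < NR_TOY_CLASSES && !max; cls++) {
		for (int cpu = 0; cpu < nr_cpus; cpu++) {
			const struct task *t;

			if (!scan_siblings && cpu != this_cpu)
				continue;
			t = pick_task(&rqs[cpu], cls);
			if (t && (!max || t->prio > max->prio))
				max = t;
		}
	}

	for (int cpu = 0; cpu < nr_cpus; cpu++)
		picked[cpu] = max ? pick_matching(&rqs[cpu], max->cookie) : NULL;
}
```

With the runqueues from build_example() and schedule() running on rq0,
core_pick(rqs, 2, 0, 0, picked) leaves CFS1 on rq0 and rq1 idle, while
core_pick(rqs, 2, 0, 1, picked) idles rq0 and runs RT1 on rq1, which
matches the two tables in the changelog.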

Balbir