Re: [PATCH v11 7/7] sched: Split scheduler and execution contexts

From: Juri Lelli
Date: Wed Jul 31 2024 - 05:12:10 EST


Hi John,

On 12/07/24 12:10, John Stultz wrote:
> On Fri, Jul 12, 2024 at 8:02 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Jul 09, 2024 at 01:31:50PM -0700, John Stultz wrote:
> > > From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > >
> > > Let's define the scheduling context as all the scheduler state
> > > in task_struct for the task selected to run, and the execution
> > > context as all state required to actually run the task.
> > >
> > > Currently both are intertwined in task_struct. We want to
> > > logically split these such that we can use the scheduling
> > > context of the task selected to be scheduled, but use the
> > > execution context of a different task to actually be run.
> > >
> > > To this purpose, introduce rq_selected() macro to point to the
> > > task_struct selected from the runqueue by the scheduler, and
> > > will be used for scheduler state, and preserve rq->curr to
> > > indicate the execution context of the task that will actually be
> > > run.
> >
> > > * Swapped proxy for selected for clarity
> >
> > I'm not loving this naming... what does selected even mean? What was
> > wrong with proxy? -- (did we have this conversation before?)
>
> So yeah, this came up earlier:
> https://lore.kernel.org/lkml/CANDhNCr3acrEpBYd2LVkY3At=HCDZxGWqbMMwzVJ-Mn--dv3DA@xxxxxxxxxxxxxx/
>
> My big concern is that the way proxy was used early in the series
> seemed to be inverted from how the term is commonly used.
>
> A proxy is one who takes an action on behalf of someone else.
>
> In this case we have a blocked task that was picked to run, but then
> we run another task in its place. Intuitively, this makes the proxy
> the one that actually runs, not the one that was picked. But the
> earliest versions of the patch had this flipped, and caused lots of
> conceptual confusion in the discussions I had with folks about what
> the patch was doing (as well as my own confusion initially working on
> the patch).

I don't have a strong preference either way, but I had actually
considered the proxy to be the blocked donor (the one picked by the
scheduler to run): the owner runs with the donor's scheduling
properties, so the donor is acting as a proxy for the owner.

In that case, I think find_proxy_task() has a confusing name, as it
doesn't actually try to find a proxy, but rather the owner of a mutex
(or of a chain of mutexes). We could rename it to find_owner_task(),
which seems to make sense, as we would then have

pick_again:
	next = pick_next_task(rq, rq_proxy(rq), &rf);
	rq_set_proxy(rq, next);
	if (unlikely(task_is_blocked(next))) {
		next = find_owner_task(rq, next, &rf);

where, if next was blocked, we search for the owner it is blocked on.
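
To illustrate what I mean by searching for the owner, here is a rough
userspace sketch of the chain walk. It is only a toy model, not the
code from the series: struct toy_task, struct toy_mutex and
toy_find_owner_task() are made-up names, and it ignores locking,
migration and everything else the real function has to deal with.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task;

/* Hypothetical toy mutex: it only records its current owner. */
struct toy_mutex {
	struct toy_task *owner;
};

/* Hypothetical toy task: blocked_on is NULL when the task is runnable. */
struct toy_task {
	const char *name;
	struct toy_mutex *blocked_on;
};

static int toy_task_is_blocked(const struct toy_task *p)
{
	return p->blocked_on != NULL;
}

/*
 * Starting from the task the scheduler picked, follow the
 * blocked_on -> owner links until a task that is not blocked is found.
 * That task would provide the execution context, while the picked task
 * keeps providing the scheduling context.
 *
 * Returns NULL if some mutex in the chain has no owner. Assumes the
 * chain has no cycle (no deadlock).
 */
static struct toy_task *toy_find_owner_task(struct toy_task *picked)
{
	struct toy_task *p = picked;

	assert(picked != NULL);

	while (toy_task_is_blocked(p)) {
		p = p->blocked_on->owner;
		if (!p)
			return NULL;
	}

	return p;
}
```

With A blocked on a mutex owned by B, and B blocked on a mutex owned by
C, calling toy_find_owner_task() on A returns C. In that picture A is
what I was calling the proxy, and C is the owner that gets to run.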

Anyway, just wanted to tell you the way I understood this so far. Happy
to go with what you and Peter decide to go with. :)

Best,
Juri