Re: [PATCH 0/9] sched: Prepare for sched_ext

From: Joel Fernandes
Date: Thu Aug 22 2024 - 08:59:47 EST


On Wed, Aug 21, 2024 at 5:41 PM Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>
> On Tue, Aug 13, 2024 at 6:50 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > These patches apply on top of the EEVDF series (queue/sched/core), which
> > re-arranges the fair pick_task() functions to make them state invariant such
> > that they can easily be restarted upon picking (and dequeueing) a delayed task.
> >
> > The same is required to push (the final) put_prev_task() beyond pick_task(),
> > like we do for sched_core already.
> >
> > This in turn is done to prepare for sched_ext, which wants a final callback to
> > be in possession of the next task, such that it can tell if the context switch
> > will leave the sched_class.
> >
> > As such, this all re-arranges the current order of:
> >
> >   put_prev_task(rq, prev);
> >   next = pick_next_task(rq); /* implies set_next_task(.first=true); */
> >
> > to something like:
> >
> >   next = pick_task(rq);
> >   if (next != prev) {
> >     put_prev_task(rq, prev, next);
> >     set_next_task(rq, next, true);
> >   }
> >
> > The patches do a fair bit of cleaning up. Notably a bunch of sched_core stuff
> > -- Joel, could you please test this stuff, because the self-tests we have are
> > hardly adequate.
> >
> > The EEVDF stuff was supposed to be merged already, but since Valentin seems to
> > be doing a read-through, I figured I'd give him a little extra time. A complete
> > set can be found at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/prep
> >
>
> So I booted queue.git sched/core branch on a newish Chromebook (after
> applying 700 patches for making it boot and spending 2 days on it
> since we boot old kernels -- I wasn't joking when I said I would carve
> some time up for you this week :P).
>
> With sched/core, it boots fine with core scheduling disabled, but
> when core scheduling is enabled I am getting hard hangs and only
> occasionally get to the login screen if I'm lucky. So there's
> definitely something wonky between the sched/core branch and core sched.
> I could not get a trace or logs yet, since once it hangs I have to
> hard power off.
>
> I could bisect it tomorrow though since it looks like a manageable
> set of patches on 6.11-rc1. Or did you already figure out the issue?
>
> I am based on:
> commit aef6987d89544d63a47753cf3741cabff0b5574c
> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Thu Jun 20 13:16:49 2024 +0200
>
> sched/eevdf: Propagate min_slice up the cgroup hierarchy

One of these 29 patches in sched/core broke core scheduling and causes
hangs. I haven't narrowed down which one yet; not much time today. I will
probably try to collect some logs.
https://hastebin.com/share/uqubojiqiy.yaml

Also I realized I should apply the 9 in this set too. But at the very
least it appears the above 29 broke core-sched vs bisection, probably the
delayed-dequeue or task-pick rework?

I will try the sched/prep branch now, which has the 9 in this set too.

thanks,

- Joel