Re: [PATCH] sched/pi: Reweight fair_policy() tasks when inheriting prio

From: Vincent Guittot
Date: Wed Apr 03 2024 - 09:54:29 EST


On Wed, 3 Apr 2024 at 15:40, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Wed, 3 Apr 2024 15:11:06 +0200
> Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>
> > On Wed, 3 Apr 2024 at 02:59, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
> > >
> > > For fair tasks inheriting the priority (nice) without reweighting is
> > > a NOP as the task's share won't change.
> >
> > AFAICT, there is no nice priority inheritance with rt_mutex; All nice
> > tasks are sorted with the same "default prio" in the rb waiter tree.
> > This means that the rt top waiter is not the cfs with highest prio but
> > the 1st cfs waiting for the mutex.
>
> I think the issue here is that the running process doesn't update its
> weight and if there are other tasks that are not contending on this mutex,
> they can still starve the lock owner.

But I think this is on purpose: we don't boost cfs tasks, and we never
have. Boosting them could be a good thing to do, but the current code
was not written for that, and doing so might raise other problems. I
don't think it's an oversight.
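
To make the weight point concrete, here is a minimal userspace sketch
(not kernel code) of the effect described in the patch: changing only the
nice value of a fair task leaves its share untouched until the weight is
recomputed. The toy_* names are made up for illustration; the table
values are the ones I remember from sched_prio_to_weight[], and the
"share" is simply weight over total weight against one competitor.

```c
#include <assert.h>

/* Values as in the kernel's sched_prio_to_weight[]; index 0 is nice -20. */
static const unsigned int toy_weight_table[40] = {
	88761, 71755, 56483, 46273, 36291,	/* -20 */
	29154, 23254, 18705, 14949, 11916,	/* -15 */
	 9548,  7620,  6100,  4904,  3906,	/* -10 */
	 3121,  2501,  1991,  1586,  1277,	/*  -5 */
	 1024,   820,   655,   526,   423,	/*   0 */
	  335,   272,   215,   172,   137,	/*   5 */
	  110,    87,    70,    56,    45,	/*  10 */
	   36,    29,    23,    18,    15,	/*  15 */
};

/* Hypothetical stand-in for a fair task; not the kernel's task_struct. */
struct toy_task {
	int nice;		/* what a "boost" changes */
	unsigned int weight;	/* what the fair scheduler actually uses */
};

static unsigned int toy_nice_to_weight(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return toy_weight_table[nice + 20];
}

static void toy_init(struct toy_task *t, int nice)
{
	t->nice = nice;
	t->weight = toy_nice_to_weight(nice);
}

/* Boost without reweight: nice changes, the weight stays stale. */
static void toy_boost_no_reweight(struct toy_task *t, int nice)
{
	t->nice = nice;
}

/* Boost with reweight: the weight follows the new nice value. */
static void toy_boost_reweight(struct toy_task *t, int nice)
{
	t->nice = nice;
	t->weight = toy_nice_to_weight(nice);
}

/* CPU share of t in percent (rounded down) against a single competitor. */
static unsigned int toy_share_pct(const struct toy_task *t,
				  const struct toy_task *other)
{
	return 100u * t->weight / (t->weight + other->weight);
}
```

With a nice 19 lock owner competing against an unrelated nice 0 task,
toy_share_pct() stays at 1 after toy_boost_no_reweight(&owner, 0) and
only moves to 50 after toy_boost_reweight(&owner, 0).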

>
> IIUC (it's been ages since I looked at the code), high nice values (low
> priority) turn to at least nice 0 when they are "boosted". It doesn't
> improve their chances of getting the lock though.
>
> >
> > >
> > > This is visible when running with PTHREAD_PRIO_INHERIT where fair tasks
> > > with low priority values are susceptible to starvation leading to PI
> > > like impact on lock contention.
> > >
> > > The logic in rt_mutex will reset these low priority fair tasks into nice
> > > 0, but without the additional reweight operation to actually update the
> > > weights, it doesn't have the desired impact of boosting them to allow
> > > them to run sooner/longer to release the lock.
> > >
> > > Apply the reweight for fair_policy() tasks to achieve the desired boost
> > > for those low nice values tasks. Note that boost here means resetting
> > > their nice to 0; as this is what the current logic does for fair tasks.
> >
> > But, on the contrary, you can also decrease the cfs prio of a task,
> > and it is even worse given the comment:
> > /* XXX used to be waiter->prio, not waiter->task->prio */
> >
> > we use the prio of the top cfs waiter (ie the one waiting for the
> > lock) not the default 0 so it can be anything in the range [-20:19]
> >
> > Then, a task with low prio (i.e. nice > 0) can get a prio boost even
> > if this task and the waiter are low priority tasks
>
>
> Yeah, I'm all confused as to exactly how the inheritance works with
> SCHED_OTHER. I know John Stultz worked on this for a bit recently. He's
> Cc'ed. But may not be paying attention ;-)
>
> -- Steve
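
For reference, the waiter ordering discussed earlier in this thread can
be modelled roughly as below. This is a toy userspace sketch, assuming
the rt_mutex waiter key keeps the prio of RT tasks and collapses every
fair task to the default prio; the toy_* names are hypothetical and
deadline tasks are ignored.

```c
#include <assert.h>

#define TOY_MAX_RT_PRIO		100	/* prio below this is an RT priority */
#define TOY_DEFAULT_PRIO	120	/* prio of a nice 0 fair task */

/* Kernel-style prio of a fair task: nice -20..19 maps to 100..139. */
static int toy_nice_to_prio(int nice)
{
	assert(nice >= -20 && nice <= 19);
	return TOY_DEFAULT_PRIO + nice;
}

/*
 * Sort key of a waiter: RT tasks keep their prio, all fair tasks share
 * the same default prio whatever their nice value is.
 */
static int toy_waiter_prio(int task_prio)
{
	if (task_prio < TOY_MAX_RT_PRIO)
		return task_prio;
	return TOY_DEFAULT_PRIO;
}

/* Nonzero if waiter a sorts strictly before waiter b (lower key first). */
static int toy_waiter_less(int a_prio, int b_prio)
{
	return toy_waiter_prio(a_prio) < toy_waiter_prio(b_prio);
}
```

In this model a nice -20 waiter and a nice 19 waiter compare as equal,
so neither sorts ahead of the other on nice alone and the order among
fair waiters comes from when they queued. An RT waiter still sorts ahead
of both.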