Re: Sum of weights idea for CFS PI
From: Youssef Esmat
Date: Thu Oct 06 2022 - 15:40:29 EST
On Thu, Oct 6, 2022 at 8:53 AM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Oct 5, 2022 at 6:04 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
> >
> > Hi Joel
> >
> > On 10/04/22 16:27, Joel Fernandes wrote:
> >
> > [...]
> >
> > > I am treating the following the same:
> > >
> > > a. when A is running, it would be as above.
> > > b. but if A was sleeping, B, C, and D would get 1/3.
> > >
> > > similar to
> > >
> > > a. when A is running *and blocked on C for all its runtime*
> > > ^^ -- in this case, B and D should not have their distributions
> > > changed at all because they are not participating in the
> > > lock acquire and release. So they should neither be hurt
> > > any more, nor be boosted. They should simply stay same [1]
> > >
> > > b. but if A was sleeping, B, C, and D would get 1/3.
> > >
> > >
> > > [1] Why? Consider 3 tasks in the all-RT case, A high, B medium and C low prio.
> > >
> > > If all are running 100% and A does not block on C, B is blocked by A
> > > indefinitely. So the prio of A and B are inverted. We seek to rectify this, that
> > > is we need make changes such that, B is returned back to the blocked state. We
> > > do this by boosting C.
> > >
> > > In other words, the prio inheritance will cause B's distribution to not be
> > > changed (it was supposed to be blocked before and it is now going to be blocked
> > > state again).
> > >
> > > CFS should not behave any differently, B's distribution should not be changed
> > > before/after the priority inhertiance of A by C. That's just my opinion - and
> > > that's how I calculated to distribution. With that mind, could you go back to
> > > seeing if my math was originally correct or did I mess something up?
> >
> > It's not about the math. But I think the before and after can't be the same for
> > C..
>
> C is acquiring/releasing the lock so I expect its distribution to
> change. I was talking about the poor B who has nothing to do with the
> lock.
>
> > > > I don't think this is valid. If A is blocked on C for 50% of the time, and
> > > > sleeping for 50% of the time, when did it get blocked/unblocked?
> > > >
> > > > This will have an impact on the average share for C and skew it, no?
> > > >
> > > > Unless I missed something, the average share of C being (3/5 + 1/3) is an
> > > > impossible state. You need to consider the portion of time when C runs as 1/5,
> > > > when A is actually not blocked on anything, too.
> > > >
> > > > Hmm actually I just re-read your statement below and you just say 3/5 (18/30)
> > > > is too much. You didn't consider the average. I'll leave the above in hope to
> > > > help me understand what am I missing and where I went wrong :-)
> > > >
> > > > Generally IMHO looking at the average will not help. I think if the share
> > > > values make sense in each state individually (and I believe they are), that
> > > > would be enough. AFAICS, B and D are still taking the right amount of time when
> > > > C inherits the bandwidth. And C by definition will run longer when A is blocked
> > > > on it for the whole duration of this blocked time.
> > >
> > > I was degenerating the case where A sleeps (say I/O) vs A blocks, to simplify
> > > the math, and then taking average of that. I think that's reasonable?
> >
> > I'm not sure. This is skewing the results in my view.
> >
> > I think the comparison should just be:
> >
> > 1) A, B, C, and D are all running and nothing gets blocked at all. Then shares
> > would be:
> >
> > 2/5, 1/5, 1/5, 1/5
> >
> > 2) A is blocked and C; B, C, D are running with no blocked time. Shares would
> > be:
> >
> > - , 1/5, 3/5, 1/5
> >
> > By definition, we want to treat A in (2) as RUNNING because as soon as
> > C unblocks A we should return to (1). From B and D perspective, their share is
> > not impacted throughout this transition. Which is AFAIU is what we want to
> > achieve.
> >
> > I think considering the sleeping time and averaging can lead to misleading
> > results if care is not taken.
>
> Yes, but that doesn't mean we can just ignore it. It is easy in my
> view to skew the inherited weight to a very large number, only to find
> that tasks unrelated to the lock acquire/release are "suffering"
> though they had nothing to do with the lock or the PI. But it is
> reasonable to try the simple approach first and see the impact.
>
> I also never said the averaging approach or consideration of sleeping
> time is perfect ;-)
>
> > Anyway - just trying to explain how I see it and why C is unlikely to be taking
> > too much time. I could be wrong. As Youssef said, I think there's no
> > fundamental problem here.
>
> I know on Android where they use smaller HZ, the large tick causes
> lots of problems for large nice deltas. Example if a highly niced task
> was to be preempted for 1ms, and preempts instead at 3ms, then the
> less-niced task will not be so nice (even less nice than it promised
> to be) any more because of the 2ms boost that the higher niced task
> got. This can lead the the sched_latency thrown out of the window. Not
> adjusting the weights properly can potentially make that problem much
> worse IMO.
Once C releases the lock it should get adjusted and A will get
adjusted also regardless of tick. At the point we adjust the weights
we have a chance to check for preemption and cause a reschedule.
If C doesn't release the lock quickly (hopefully rare), it should
continue to run at the adjusted weight since it is still blocking A.
>
> Thanks.