Re: [RFC PATCH] sched, cgroup: cgroup1 can also take the non-RUNTIME_INF min
From: Phil Auld
Date: Tue Sep 10 2024 - 08:12:24 EST
On Tue, Sep 10, 2024 at 07:13:32PM +0800 刘嵩 wrote:
>
>
> > 2024年9月10日 18:49,Phil Auld <pauld@xxxxxxxxxx> 写道:
> >
> >
> > Hi,
> >
> > On Tue, Sep 10, 2024 at 03:48:32PM +0800 Liu Song wrote:
> >> For the handling logic of child_quota, there is no need to distinguish
> >> between cgroup1 and cgroup2, so unify the handling logic here.
> >>
> >> Signed-off-by: Liu Song <liusong@xxxxxxxxxxxxxxxxx>
> >> ---
> >> kernel/sched/core.c | 21 +++++----------------
> >> 1 file changed, 5 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index e752146e59a4..8418c67faa69 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -9501,23 +9501,12 @@ static int tg_cfs_schedulable_down(struct task_group *tg, void *data)
> >> parent_quota = parent_b->hierarchical_quota;
> >>
> >> /*
> >> - * Ensure max(child_quota) <= parent_quota. On cgroup2,
> >> - * always take the non-RUNTIME_INF min. On cgroup1, only
> >> - * inherit when no limit is set. In both cases this is used
> >> - * by the scheduler to determine if a given CFS task has a
> >> - * bandwidth constraint at some higher level.
> >
> > This comment is here for a reason. Please don't remove it.
>
> Hi
>
> I don’t see why cgroup1 needs to impose this restriction while cgroup2
> can directly take the non-RUNTIME_INF minimum value. What is the
> necessity of this?
>
That's how cgroupv1 bandwidth control is defined. See
Documentation/scheduler/sched-bcw.rst.
> It seems more reasonable to unify the handling logic. Even if the child
> group quota exceeds the parent group quota, it would not actually take
> effect.
>
It's not about it taking effect or not. You are not supposed to be
allowed to configure a child quota > parent quota. It's supposed to
be an error.
Also, my comment about the comment specifically is that last sentence, which
explains that other parts of the scheduler rely on this being set correctly,
needs to remain. But since I don't think this change is right, that should
not be an issue.
Cheers,
Phil
> However, if the parent group quota is reset to a larger value, then the
> child group quota would have actual significance. Therefore, the handling
> logic should be consistent between cgroup1 and cgroup2.
>
> Thanks
>
>
> >
> >> + * Ensure max(child_quota) <= parent_quota.
> >> */
> >> - if (cgroup_subsys_on_dfl(cpu_cgrp_subsys)) {
> >> - if (quota == RUNTIME_INF)
> >> - quota = parent_quota;
> >> - else if (parent_quota != RUNTIME_INF)
> >> - quota = min(quota, parent_quota);
> >> - } else {
> >> - if (quota == RUNTIME_INF)
> >> - quota = parent_quota;
> >> - else if (parent_quota != RUNTIME_INF && quota > parent_quota)
> >> - return -EINVAL;
> >> - }
> >> + if (quota == RUNTIME_INF)
> >> + quota = parent_quota;
> >> + else if (parent_quota != RUNTIME_INF)
> >> + quota = min(quota, parent_quota);
> >> }
> >> cfs_b->hierarchical_quota = quota;
> >>
> >
> > I don't think there is a need to optimize this slow path
> > to allow setting invalid values which have to be handled in
> > fast paths. And this will change expected behavior.
> >
> > So NAK.
> >
> > Cheers,
> > Phil
> >
> > --
>
--