Re: [PATCH v2] sched: let __sched_period() use rq's nr_running

From: Byungchul Park
Date: Sun Jul 12 2015 - 21:03:24 EST


On Fri, Jul 10, 2015 at 02:31:10PM +0100, Morten Rasmussen wrote:
> On Fri, Jul 10, 2015 at 05:11:30PM +0900, byungchul.park@xxxxxxx wrote:
> > From: Byungchul Park <byungchul.park@xxxxxxx>
> >
> > __sched_period() returns a period which a rq can have. the period has to be
> > stretched by the number of tasks *the rq has*, when nr_running > nr_latency.
> > otherwise, a task's slice can be much smaller than sysctl_sched_min_granularity
> > depending on its position in the tg hierarchy when CONFIG_FAIR_GROUP_SCHED.
> >
> > Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
> > ---
> > kernel/sched/fair.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 09456fc..8ae7aeb 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -635,7 +635,7 @@ static u64 __sched_period(unsigned long nr_running)
> > */
> > static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > {
> > - u64 slice = __sched_period(cfs_rq->nr_running + !se->on_rq);
> > + u64 slice = __sched_period(rq_of(cfs_rq)->nr_running + !se->on_rq);

hello,

>
> This would stretch the period to fit rq->cfs.h_nr_running (which is
> equal to rq.nr_running), but I still think that the slice may be smaller
> than sched_min_granularity for low priority tasks since the slice is

yes, i also think the slice may be smaller than sched_min_granularity for
low priority tasks, while the slice may be larger than sched_min_granularity
for high priority tasks. and as you may know, the slice is already scaled
by priority in sched_slice().
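
to make the priority scaling concrete, here is a minimal user-space sketch, not kernel code. it assumes the unscaled defaults (6ms latency, 0.75ms min granularity, nr_latency of 8) and uses plain integer division where the kernel uses __calc_delta(), so the numbers are only approximate. the model_* names are hypothetical.

```c
#include <stdint.h>

/* unscaled defaults, assumed here for illustration (all in ns) */
#define SCHED_LATENCY_NS          6000000ULL /* sysctl_sched_latency */
#define SCHED_MIN_GRANULARITY_NS   750000ULL /* sysctl_sched_min_granularity */
#define SCHED_NR_LATENCY                8UL  /* sched_nr_latency */

/* model of __sched_period(): stretch only when nr_running > nr_latency */
static uint64_t model_sched_period(unsigned long nr_running)
{
	if (nr_running > SCHED_NR_LATENCY)
		return SCHED_MIN_GRANULARITY_NS * nr_running;
	return SCHED_LATENCY_NS;
}

/* single-level model of sched_slice(): period * weight / total weight */
static uint64_t model_sched_slice(unsigned long nr_running,
				  uint64_t weight, uint64_t total_weight)
{
	return model_sched_period(nr_running) * weight / total_weight;
}
```

with a nice-0 task (weight 1024) and a nice-10 task (weight 110) on one cfs_rq, model_sched_slice(2, 110, 1134) is about 0.58ms, below the 0.75ms min granularity, while model_sched_slice(2, 1024, 1134) is about 5.4ms.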

in order to scale the slice properly in sched_slice(), __sched_period()
should return an rq-wide period. otherwise, i think we should change the other
code that assumes variables like sysctl_sched_min_granularity are comparable
to a task's execution time regardless of its position in the cgroup
hierarchy. for example, see check_preempt_tick().
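
the hierarchy effect can be sketched in user space as below. this is a simplified model, not kernel code: it assumes equal weights everywhere, a two-level hierarchy (tasks inside groups on the root cfs_rq), an entity that is already queued, and the unscaled defaults. the model_* names are hypothetical.

```c
#include <stdint.h>

/* unscaled defaults, assumed here for illustration (all in ns) */
#define SCHED_LATENCY_NS          6000000ULL /* sysctl_sched_latency */
#define SCHED_MIN_GRANULARITY_NS   750000ULL /* sysctl_sched_min_granularity */
#define SCHED_NR_LATENCY                8UL  /* sched_nr_latency */

/* model of __sched_period(): stretch only when nr_running > nr_latency */
static uint64_t model_sched_period(unsigned long nr_running)
{
	if (nr_running > SCHED_NR_LATENCY)
		return SCHED_MIN_GRANULARITY_NS * nr_running;
	return SCHED_LATENCY_NS;
}

/*
 * equal-weight model of the walk in sched_slice(): the period is
 * computed once from period_nr_running, then divided by the number
 * of entities at each level, from the task's cfs_rq up to the root.
 */
static uint64_t model_two_level_slice(unsigned long period_nr_running,
				      unsigned long tasks_per_group,
				      unsigned long nr_groups)
{
	uint64_t slice = model_sched_period(period_nr_running);

	slice /= tasks_per_group; /* task's share inside its group */
	slice /= nr_groups;       /* group entity's share on the root */
	return slice;
}
```

with 8 groups of 8 tasks each, the local cfs_rq count is 8, so model_two_level_slice(8, 8, 8) leaves the period at 6ms and gives each task 93750ns. with the rq-wide count of 64, model_two_level_slice(64, 8, 8) stretches the period to 48ms and gives each task 750000ns.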

> scaled by priority.
>
> Also, I'm not sure if we want to enforce sched_slice >=
> sched_min_granularity, it would mean that tasks inside task groups can
> stretch the overall period and increase latency for non-grouped tasks.

we don't need to enforce sched_slice >= sched_min_granularity. i am just
saying that the rq-wide period should be stretched by the rq-wide nr_running,
from which sched_slice() later calculates the actual task's slice.

and i agree that it increases latency for non-grouped tasks.
to prevent that, IMHO, we need to fix how it is calculated. however, when
getting an *rq-wide* period, stretching it by the local cfs_rq's nr_running
looks weird.
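
the latency concern can be estimated with a rough user-space sketch. it is not kernel code: it assumes equal weights, the unscaled defaults, and that an entity on the root cfs_rq waits for the rest of the period while its peers run, ignoring wakeup preemption. the model_* names are hypothetical.

```c
#include <stdint.h>

/* unscaled defaults, assumed here for illustration (all in ns) */
#define SCHED_LATENCY_NS          6000000ULL /* sysctl_sched_latency */
#define SCHED_MIN_GRANULARITY_NS   750000ULL /* sysctl_sched_min_granularity */
#define SCHED_NR_LATENCY                8UL  /* sched_nr_latency */

/* model of __sched_period(): stretch only when nr_running > nr_latency */
static uint64_t model_sched_period(unsigned long nr_running)
{
	if (nr_running > SCHED_NR_LATENCY)
		return SCHED_MIN_GRANULARITY_NS * nr_running;
	return SCHED_LATENCY_NS;
}

/*
 * rough wait per period for one of nr_root_entities equal-weight
 * entities on the root cfs_rq: the period minus its own share.
 */
static uint64_t model_root_wait(unsigned long period_nr_running,
				unsigned long nr_root_entities)
{
	uint64_t period = model_sched_period(period_nr_running);

	return period - period / nr_root_entities;
}
```

take one non-grouped task next to one group holding 63 tasks. the root cfs_rq has 2 entities, so with the local count model_root_wait(2, 2) is 3ms. with the rq-wide count of 64, model_root_wait(64, 2) is 24ms, which is the latency increase for the non-grouped task.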

what do you think?

thank you,
byungchul

> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/