Re: [PATCH 5/5] sched: limit sched_slice if it is more than sysctl_sched_latency

From: Joonsoo Kim
Date: Mon Apr 01 2013 - 01:09:23 EST


Hello Preeti.

On Fri, Mar 29, 2013 at 05:05:37PM +0530, Preeti U Murthy wrote:
> Hi Joonsoo
>
> On 03/28/2013 01:28 PM, Joonsoo Kim wrote:
> > sched_slice() computes the ideal runtime slice. If there are many tasks
> > on the cfs_rq, the period for this cfs_rq is extended to guarantee that
> > each task gets a time slice of at least sched_min_granularity. Each task
> > then gets a portion of this period. If there is a task whose load weight
> > is much larger than the others', its portion of the period can far exceed
> > sysctl_sched_latency.
>
> Correct. But that does not matter; the length of the scheduling latency
> period is determined by the return value of __sched_period(), not the
> value of sysctl_sched_latency. You would not extend the period if you
> wanted all tasks to have a slice within sysctl_sched_latency, right?
>
> So since the length of the scheduling latency period is dynamic,
> depending on the number of running processes, sysctl_sched_latency,
> which is the default latency period length, is not messed with; it is
> only used as a base to determine the actual scheduling period.
>
> >
> > For example, you can simply imagine one task with nice -20 and
> > 9 tasks with nice 0 on one cfs_rq. In this case, the load weight sum for
> > this cfs_rq is 88761 + 9 * 1024 = 97977. So the slice for the
> > task with nice -20 is sysctl_sched_min_granularity * 10 * (88761 / 97977),
> > that is, approximately sysctl_sched_min_granularity * 9. This effect
> > becomes even larger if there are more tasks with nice 0.
>
> Yeah, so __sched_period() says that within 40ms all tasks need to be
> scheduled at least once, and the highest priority task gets nearly 36ms of
> it, while the rest is distributed among the others.
>
> >
> > So we should limit this possible weird situation.
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index e232421..6ceffbc 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -645,6 +645,9 @@ static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > }
> > slice = calc_delta_mine(slice, se->load.weight, load);
> >
> > + if (unlikely(slice > sysctl_sched_latency))
> > + slice = sysctl_sched_latency;
>
> Then in this case the highest priority thread would get
> 20ms (sysctl_sched_latency), and the rest would get
> sysctl_sched_min_granularity * 10 * (1024/97977), which would be 0.4ms.
> Then all tasks would get scheduled at least once within 20ms + (0.4*9) ms
> = 23.7ms, while your scheduling latency period was extended to 40ms, just
> so that each of these tasks doesn't have its sched_slice shrunk due to
> the large number of tasks.

I don't know if I understand your question correctly, but I will do my
best to answer your comment. :)

With this patch, I just limit the maximum slice a task can get at one go.
Scheduling is still controlled through vruntime, so in this case the task
with nice -20 will simply be scheduled twice.

20 + (0.4 * 9) + 20 ≈ 43.7 ms

And after roughly 43.7 ms, this pattern repeats.

So I can tell you that the scheduling period is preserved as before.
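
Just to make the arithmetic concrete, here is a rough userspace sketch of
the numbers above (not kernel code; it assumes the values implied by this
example, sysctl_sched_latency = 20ms and sysctl_sched_min_granularity = 4ms,
so sched_nr_latency = 5):

#include <stdio.h>

int main(void)
{
	/* Values implied by the example above; real defaults vary by machine. */
	const double latency = 20.0;	/* sysctl_sched_latency, in ms */
	const double min_gran = 4.0;	/* sysctl_sched_min_granularity, in ms */
	const int nr_latency = 5;	/* latency / min_gran */
	const double weight[10] = { 88761,	/* one nice -20 task */
		1024, 1024, 1024, 1024, 1024,
		1024, 1024, 1024, 1024 };	/* nine nice 0 tasks */
	double total = 0.0, period, slice, clamped;
	int i;

	for (i = 0; i < 10; i++)
		total += weight[i];

	/* crude model of __sched_period(): stretch when nr_running > nr_latency */
	period = (10 > nr_latency) ? 10 * min_gran : latency;

	for (i = 0; i < 10; i++) {
		/* crude model of sched_slice() plus the clamp from this patch */
		slice = period * weight[i] / total;
		clamped = (slice > latency) ? latency : slice;
		printf("task %d: slice %5.2f ms, clamped %5.2f ms\n",
		       i, slice, clamped);
	}
	return 0;
}

It prints a ~36.2 ms slice for the nice -20 task, clamped to 20 ms, and
~0.42 ms for each nice 0 task, which is where the 20 + (0.4 * 9) + 20
timeline above comes from.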

If we give a task a long slice at one go, it can cause a latency problem
for the others. So IMHO, limiting it is meaningful.

Thanks.

>
> > +
> > return slice;
> > }
> >
>
> Regards
> Preeti U Murthy
>