Re: [tip:sched/core] sched: Re-tune the scheduler latency defaults to decrease worst-case latencies

From: Martin Steigerwald
Date: Sat Sep 12 2009 - 07:45:38 EST


On Wednesday 09 September 2009, tip-bot for Mike Galbraith wrote:
> Commit-ID: 172e082a9111ea504ee34cbba26284a5ebdc53a7
> Gitweb:
> http://git.kernel.org/tip/172e082a9111ea504ee34cbba26284a5ebdc53a7
> Author: Mike Galbraith <efault@xxxxxx>
> AuthorDate: Wed, 9 Sep 2009 15:41:37 +0200
> Committer: Ingo Molnar <mingo@xxxxxxx>
> CommitDate: Wed, 9 Sep 2009 17:30:06 +0200
>
> sched: Re-tune the scheduler latency defaults to decrease worst-case
> latencies
>
> Reduce the latency target from 20 msecs to 5 msecs.
>
> Why? Larger latencies increase spread, which is good for scaling,
> but bad for worst case latency.
>
> We still have the ilog(nr_cpus) rule to scale up on bigger
> server boxes.
>
> Signed-off-by: Mike Galbraith <efault@xxxxxx>
> Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> LKML-Reference: <1252486344.28645.18.camel@xxxxxxxxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
>
> ---
> kernel/sched_fair.c | 12 ++++++------
> 1 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index af325a3..26fadb4 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -24,7 +24,7 @@
>
> /*
> * Targeted preemption latency for CPU-bound tasks:
> - * (default: 20ms * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 5ms * (1 + ilog(ncpus)), units: nanoseconds)
> *
> * NOTE: this latency value is not the same as the concept of
> * 'timeslice length' - timeslices in CFS are of variable length
> @@ -34,13 +34,13 @@
> * (to see the precise effective timeslice length of your workload,
> * run vmstat and monitor the context-switches (cs) field)
> */
> -unsigned int sysctl_sched_latency = 20000000ULL;
> +unsigned int sysctl_sched_latency = 5000000ULL;
>
> /*
> * Minimal preemption granularity for CPU-bound tasks:
> - * (default: 4 msec * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
> */
> -unsigned int sysctl_sched_min_granularity = 4000000ULL;
> +unsigned int sysctl_sched_min_granularity = 1000000ULL;

Needs to be lower for a fluid desktop experience here:

shambhala:/proc/sys/kernel> cat sched_min_granularity_ns
100000

>
> /*
> * is kept at sysctl_sched_latency / sysctl_sched_min_granularity
> @@ -63,13 +63,13 @@ unsigned int __read_mostly
> sysctl_sched_compat_yield;
>
> /*
> * SCHED_OTHER wake-up granularity.
> - * (default: 5 msec * (1 + ilog(ncpus)), units: nanoseconds)
> + * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
> *
> * This option delays the preemption effects of decoupled workloads
> * and reduces their over-scheduling. Synchronous workloads will still
> * have immediate wakeup/sleep latencies.
> */
> -unsigned int sysctl_sched_wakeup_granularity = 5000000UL;
> +unsigned int sysctl_sched_wakeup_granularity = 1000000UL;

Ditto:

shambhala:/proc/sys/kernel> cat sched_wakeup_granularity_ns
100000

With

shambhala:~> cat /proc/version
Linux version 2.6.31-rc7-tp42-toi-3.0.1-04741-g57e61c0 (martin@shambhala)
(gcc version 4.3.3 (Debian 4.3.3-10) ) #6 PREEMPT Sun Aug 23 10:51:32 CEST
2009

on my ThinkPad T42.

Otherwise compositing animations like switching desktops and zooming in
newly opening windows still appear jerky. Even with:

shambhala:/sys/kernel/debug> cat sched_features
NO_NEW_FAIR_SLEEPERS NO_NORMALIZED_SLEEPER ADAPTIVE_GRAN WAKEUP_PREEMPT
START_DEBIT AFFINE_WAKEUPS CACHE_HOT_BUDDY SYNC_WAKEUPS NO_HRTICK
NO_DOUBLE_TICK ASYM_GRAN LB_BIAS LB_WAKEUP_UPDATE ASYM_EFF_LOAD
NO_WAKEUP_OVERLAP LAST_BUDDY OWNER_SPIN

But NO_NEW_FAIR_SLEEPERS also gives a benefit. It makes those animations
even smoother.

All in all I am quite happy with

shambhala:/proc/sys/kernel> grep "" *sched*
sched_child_runs_first:0
sched_compat_yield:0
sched_features:113916
sched_latency_ns:5000000
sched_migration_cost:500000
sched_min_granularity_ns:100000
sched_nr_migrate:32
sched_rt_period_us:1000000
sched_rt_runtime_us:950000
sched_shares_ratelimit:250000
sched_shares_thresh:4
sched_wakeup_granularity_ns:100000

for now.

It really makes a *lot* of difference. But it appears that both
sched_min_granularity_ns and sched_wakeup_granularity_ns have to be lower
on my ThinkPad for best effect.
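
To put the values above next to the patched defaults, here is a minimal
sketch of the "base * (1 + ilog(ncpus))" rule from the comments in the
quoted patch. It assumes that ilog means the floor of log base 2; the
helper names ilog2_floor() and scaled_default_ns() are made up for
illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(n)) for n >= 1; returns 0 for n == 0 as well.
 * Assumed to match what the patch comments call ilog(). */
static unsigned int ilog2_floor(unsigned int n)
{
    unsigned int log = 0;

    while (n >>= 1)
        log++;
    return log;
}

/* Scale a base value in nanoseconds by (1 + ilog(ncpus)), following
 * the rule stated in the sched_fair.c comments of the quoted patch. */
static uint64_t scaled_default_ns(uint64_t base_ns, unsigned int ncpus)
{
    assert(ncpus >= 1); /* a machine has at least one CPU */
    return base_ns * (1 + ilog2_floor(ncpus));
}
```

Assuming a single-CPU machine, the factor is 1, so
scaled_default_ns(1000000, 1) gives 1000000 ns for both
sched_min_granularity_ns and sched_wakeup_granularity_ns. The 100000 ns
reported above is a tenth of that.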

I would still prefer some autotuning, where I say "desktop!" or nothing at
all. And that's it.

Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
