Re: [PATCH 1/1] sched: Make schedstats a runtime tunable that is disabled by default v4

From: Mel Gorman
Date: Wed Feb 03 2016 - 06:39:20 EST


On Wed, Feb 03, 2016 at 12:28:49PM +0100, Ingo Molnar wrote:
>
> * Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > Changelog since v3
> > o Force enable stats during profiling and latencytop
> >
> > Changelog since V2
> > o Print stats that are not related to schedstat
> > o Reintroduce a static inline for update_stats_dequeue
> >
> > Changelog since V1
> > o Introduce schedstat_enabled and address Ingo's feedback
> > o More schedstat-only paths eliminated, particularly ttwu_stat
> >
> > schedstats is very useful during debugging and performance tuning but it
> > incurs overhead. As such, even though it can be disabled at build time,
> > it is often enabled as the information is useful. This patch adds a
> > kernel command-line and sysctl tunable to enable or disable schedstats on
> > demand. It is disabled by default as someone who knows they need it can
> > also learn to enable it when necessary.
> >
> > The benefits are workload-dependent but when it gets down to it, the
> > difference will be whether cache misses are incurred updating the shared
> > stats or not. [...]
>
> Hm, which shared stats are those?

Extremely poor phrasing on my part. The stats share cache lines with each
other, and the impact partially depends on whether unrelated stats land on
the same cache line when they are updated.

> I think we should really fix those as well: those shared stats should be
> percpu collected as well, with no extra cache misses in any scheduler fast
> path.
>

I looked into that but converting those stats to per-cpu counters would
incur sizable memory overhead. There are a *lot* of them and the basic
structure for the generic percpu-counter is

struct percpu_counter {
	raw_spinlock_t lock;
	s64 count;
#ifdef CONFIG_HOTPLUG_CPU
	struct list_head list;	/* All percpu_counters are on a list */
#endif
	s32 __percpu *counters;
};

That's not taking into account the associated runtime overhead, such as
synchronising them. Granted, some specialised implementation could be done
for the scheduler but it would be massive overkill and a maintenance burden
for stats that most users do not even want.
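
To put a rough shape on the memory argument, here is a userspace sketch,
not kernel code. The mock_* types are stand-ins I made up to mirror the
layout above, their sizes assume a build without lock debugging, and
nr_stats and nr_cpus are hypothetical inputs rather than measured values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel types. */
typedef struct { uint32_t raw; } mock_raw_spinlock_t;
struct mock_list_head { struct mock_list_head *next, *prev; };

struct mock_percpu_counter {
	mock_raw_spinlock_t lock;
	int64_t count;
	struct mock_list_head list;	/* assumes CONFIG_HOTPLUG_CPU is set */
	int32_t *counters;		/* __percpu pointer in the kernel */
};

/* Bytes for one counter: the fixed struct plus one s32 slot per CPU. */
size_t percpu_counter_bytes(unsigned int nr_cpus)
{
	return sizeof(struct mock_percpu_counter) +
	       (size_t)nr_cpus * sizeof(int32_t);
}

/* Bytes if nr_stats individual stats were each converted. */
size_t converted_bytes(unsigned int nr_stats, unsigned int nr_cpus)
{
	return (size_t)nr_stats * percpu_counter_bytes(nr_cpus);
}

/* Bytes for the same number of stats kept as plain 64-bit fields. */
size_t plain_bytes(unsigned int nr_stats)
{
	return (size_t)nr_stats * sizeof(uint64_t);
}
```

Comparing converted_bytes(n, cpus) against plain_bytes(n) for whatever
counts apply to a given machine shows the fixed per-counter cost plus a
term that grows with the CPU count, and that is before any per-cpu
allocator overhead, which this sketch ignores.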

--
Mel Gorman
SUSE Labs