Re: [PATCH 1/1] sched: Make schedstats a runtime tunable that is disabled by default v3
From: Mel Gorman
Date: Tue Feb 02 2016 - 06:58:24 EST
On Tue, Feb 02, 2016 at 03:02:07PM +0530, Srikar Dronamraju wrote:
>
> Dont we need changes for sched_info_on()?
>
> If we disable schedstats dynamically but CONFIG_TASK_DELAY_ACCT is not
> set, then sched_info_on will return true,
Is this really a problem?
sched_info_on() guards sched_info_dequeued(), sched_info_queued(),
__sched_info_switch(). These update fields in the sched_info struct with
the exception of rq->rq_cpu_time. In the case of rq_cpu_time, the values
it's updated depend on sched_info.
I'm not spotting the case where the current information for delayacct is
inaccurate. Where is it? Granted, there is some scope for also disabling
the delayacct information unless explicitly enabled.
> This could impact guest steal
> time stats as well as data read from /proc/<PID>/schedstat
>
> Also when schedstats is dynamically disabled, and user tries to enable
> kernel sleep profiling profile_setup(), the kernel may not be able to do
> the right profiling since enqueue_sleeper() may not get called. Should
> we alert the user saying kernel sleep profiling is disabled?
>
Yes. This on top? It's not completely bullet proof as a user could both
force schedstat disabled and enable sleep profiling but it's a waste of
memory to guard against it
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a10494a94cc3..5c2cd37c42e9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -920,6 +920,14 @@ static inline int sched_info_on(void)
#endif
}
+#ifdef CONFIG_SCHEDSTATS
+void force_schedstat_enabled(void);
+#else
+static inline void force_schedstat_enabled(void)
+{
+}
+#endif
+
enum cpu_idle_type {
CPU_IDLE,
CPU_NOT_IDLE,
diff --git a/kernel/profile.c b/kernel/profile.c
index 99513e1160e5..51369697466e 100644
--- a/kernel/profile.c
+++ b/kernel/profile.c
@@ -59,6 +59,7 @@ int profile_setup(char *str)
if (!strncmp(str, sleepstr, strlen(sleepstr))) {
#ifdef CONFIG_SCHEDSTATS
+ force_schedstat_enabled();
prof_on = SLEEP_PROFILING;
if (str[strlen(sleepstr)] == ',')
str += strlen(sleepstr) + 1;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1a816ebaa7da..f7aff8386c3f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2295,6 +2295,14 @@ static void set_schedstats(bool enabled)
static_branch_disable(&sched_schedstats);
}
+void force_schedstat_enabled(void)
+{
+ if (!schedstat_enabled()) {
+ pr_info("kernel sleep profiling force enabled sched_schedstats\n");
+ static_branch_enable(&sched_schedstats);
+ }
+}
+
static int __init setup_schedstats(char *str)
{
int ret = 0;
--
Mel Gorman
SUSE Labs