Re: [PATCH] sched_ext: Provide a sysfs enable_seq counter
From: Phil Auld
Date: Mon Sep 23 2024 - 06:48:29 EST
Hi Andrea,
On Sat, Sep 21, 2024 at 09:39:21PM +0200 andrea.righi@xxxxxxxxx wrote:
> From: Andrea Righi <andrea.righi@xxxxxxxxx>
>
> As discussed during the distro-centric session within the sched_ext
> Microconference at LPC 2024, introduce a sequence counter that is
> incremented every time a BPF scheduler is loaded.
>
> This feature can help distributions in diagnosing potential performance
> regressions by identifying systems where users are running (or have run)
> custom BPF schedulers.
>
> Example:
>
> arighi@virtme-ng~> cat /sys/kernel/sched_ext/enable_seq
> 0
> arighi@virtme-ng~> sudo scx_simple
> local=1 global=0
> ^CEXIT: unregistered from user space
> arighi@virtme-ng~> cat /sys/kernel/sched_ext/enable_seq
> 1
>
> In this way user-space tools (such as Ubuntu's apport and similar) are
> able to gather and include this information in bug reports.
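
For what it's worth, a bug-report hook could consume the counter roughly
like this. It is only a minimal sketch, assuming a Python-based tool: the
helper names read_enable_seq() and describe_enable_seq() are made up for
illustration and are not part of apport or of this patch. Only the sysfs
path comes from the patch itself.

```python
from pathlib import Path

# Path taken from the patch; it only exists on kernels that carry it.
ENABLE_SEQ_PATH = Path("/sys/kernel/sched_ext/enable_seq")


def read_enable_seq(path=ENABLE_SEQ_PATH):
    """Return the counter as an int, or None if it cannot be read.

    Hypothetical helper: None covers kernels without sched_ext or
    without this attribute, as well as unparsable content.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_enable_seq(seq):
    """Turn the raw counter into a line suitable for a bug report."""
    if seq is None:
        return "sched_ext enable_seq: not available"
    if seq == 0:
        return "sched_ext enable_seq: 0 (no BPF scheduler loaded since boot)"
    return f"sched_ext enable_seq: {seq} (a BPF scheduler was enabled {seq} time(s) since boot)"


if __name__ == "__main__":
    print(describe_enable_seq(read_enable_seq()))
```

Treating a missing file as "not available" rather than as zero keeps
reports from older kernels distinguishable from systems that never
loaded a BPF scheduler.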
>
> Cc: Giovanni Gherdovich <giovanni.gherdovich@xxxxxxxx>
> Cc: Kleber Sacilotto de Souza <kleber.souza@xxxxxxxxxxxxx>
> Cc: Marcelo Henrique Cerri <marcelo.cerri@xxxxxxxxxxxxx>
> Cc: Phil Auld <pauld@xxxxxxxxxx>
> Signed-off-by: Andrea Righi <andrea.righi@xxxxxxxxx>
Thanks for pulling this together. I am hopeful we can get it in
a 6.12-rc.
Reviewed-by: Phil Auld <pauld@xxxxxxxxxx>
Cheers,
Phil
> ---
> Documentation/scheduler/sched-ext.rst | 10 ++++++++++
> kernel/sched/ext.c | 17 +++++++++++++++++
> tools/sched_ext/scx_show_state.py | 1 +
> 3 files changed, 28 insertions(+)
>
> diff --git a/Documentation/scheduler/sched-ext.rst b/Documentation/scheduler/sched-ext.rst
> index a707d2181a77..6c0d70e2e27d 100644
> --- a/Documentation/scheduler/sched-ext.rst
> +++ b/Documentation/scheduler/sched-ext.rst
> @@ -83,6 +83,15 @@ The current status of the BPF scheduler can be determined as follows:
> # cat /sys/kernel/sched_ext/root/ops
> simple
>
> +You can check if any BPF scheduler has ever been loaded since boot by examining
> +this monotonically incrementing counter (a value of zero indicates that no BPF
> +scheduler has been loaded):
> +
> +.. code-block:: none
> +
> + # cat /sys/kernel/sched_ext/enable_seq
> + 1
> +
> ``tools/sched_ext/scx_show_state.py`` is a drgn script which shows more
> detailed information:
>
> @@ -96,6 +105,7 @@ detailed information:
> enable_state : enabled (2)
> bypass_depth : 0
> nr_rejected : 0
> + enable_seq : 1
>
> If ``CONFIG_SCHED_DEBUG`` is set, whether a given task is on sched_ext can
> be determined as follows:
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 9ee5a9a261cc..8057ab4c76da 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -874,6 +874,13 @@ static struct scx_exit_info *scx_exit_info;
> static atomic_long_t scx_nr_rejected = ATOMIC_LONG_INIT(0);
> static atomic_long_t scx_hotplug_seq = ATOMIC_LONG_INIT(0);
>
> +/*
> + * A monotonically increasing sequence number that is incremented every time a
> + * scheduler is enabled. This can be used to check if any custom sched_ext
> + * scheduler has ever been used in the system.
> + */
> +static atomic_long_t scx_enable_seq = ATOMIC_LONG_INIT(0);
> +
> /*
> * The maximum amount of time in jiffies that a task may be runnable without
> * being scheduled on a CPU. If this timeout is exceeded, it will trigger
> @@ -4154,11 +4161,19 @@ static ssize_t scx_attr_hotplug_seq_show(struct kobject *kobj,
> }
> SCX_ATTR(hotplug_seq);
>
> +static ssize_t scx_attr_enable_seq_show(struct kobject *kobj,
> + struct kobj_attribute *ka, char *buf)
> +{
> + return sysfs_emit(buf, "%ld\n", atomic_long_read(&scx_enable_seq));
> +}
> +SCX_ATTR(enable_seq);
> +
> static struct attribute *scx_global_attrs[] = {
> &scx_attr_state.attr,
> &scx_attr_switch_all.attr,
> &scx_attr_nr_rejected.attr,
> &scx_attr_hotplug_seq.attr,
> + &scx_attr_enable_seq.attr,
> NULL,
> };
>
> @@ -5176,6 +5191,8 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
> kobject_uevent(scx_root_kobj, KOBJ_ADD);
> mutex_unlock(&scx_ops_enable_mutex);
>
> + atomic_long_inc(&scx_enable_seq);
> +
> return 0;
>
> err_del:
> diff --git a/tools/sched_ext/scx_show_state.py b/tools/sched_ext/scx_show_state.py
> index d457d2a74e1e..8bc626ede1c4 100644
> --- a/tools/sched_ext/scx_show_state.py
> +++ b/tools/sched_ext/scx_show_state.py
> @@ -37,3 +37,4 @@ print(f'switched_all : {read_static_key("__scx_switched_all")}')
> print(f'enable_state : {ops_state_str(enable_state)} ({enable_state})')
> print(f'bypass_depth : {read_atomic("scx_ops_bypass_depth")}')
> print(f'nr_rejected : {read_atomic("scx_nr_rejected")}')
> +print(f'enable_seq : {read_atomic("scx_enable_seq")}')
> --
> 2.46.0
>
>