Re: [PATCH sched_ext/for-6.13 1/2] sched_ext: Avoid live-locking bypass mode switching

From: Andrea Righi
Date: Tue Nov 05 2024 - 18:57:58 EST


Hi Tejun,

On Tue, Nov 05, 2024 at 11:48:11AM -1000, Tejun Heo wrote:
...
> +/*
> + * A poorly behaving BPF scheduler can live-lock the system by e.g. incessantly
> + * banging on the same DSQ on a large NUMA system to the point where switching
> + * to the bypass mode can take a long time. Inject artificial delays while the
> + * bypass mode is switching to guarantee timely completion.
> + */
> +static void scx_ops_breather(struct rq *rq)
> +{
> +	u64 until;
> +
> +	lockdep_assert_rq_held(rq);
> +
> +	if (likely(!atomic_read(&scx_ops_breather_depth)))
> +		return;
> +
> +	raw_spin_rq_unlock(rq);
> +
> +	until = ktime_get_ns() + NSEC_PER_MSEC;
> +
> +	do {
> +		int cnt = 1024;
> +		while (atomic_read(&scx_ops_breather_depth) && --cnt)
> +			cpu_relax();
> +	} while (atomic_read(&scx_ops_breather_depth) &&
> +		 time_before64(ktime_get_ns(), until));

Do you think there's any benefit in using the idle injection framework
here instead of this cpu_relax() loop? In the end we're trying to
throttle the scx scheduler so it stops hammering a DSQ until it gets
kicked out, so could we just inject real idle cycles instead?
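
For reference, here is a rough userspace sketch of the bounded spin in the
hunk above, only to make the behavior being discussed concrete. It is not
kernel code: breather_depth and now_ns() are hypothetical stand-ins for
scx_ops_breather_depth and ktime_get_ns(), the rq locking is left out, and
an empty loop body takes the place of cpu_relax().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#endif

#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_MSEC	1000000ULL

/* Hypothetical stand-in for scx_ops_breather_depth. */
static atomic_int breather_depth;

/* Hypothetical stand-in for ktime_get_ns(): monotonic time in ns. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Spin while breather_depth is non-zero, for at most about 1ms.
 * Returns the time spent waiting in ns, or 0 on the fast path.
 */
static uint64_t breather(void)
{
	uint64_t start, until;

	/* Fast path: nobody asked for a breather. */
	if (!atomic_load(&breather_depth))
		return 0;

	start = now_ns();
	until = start + NSEC_PER_MSEC;

	do {
		/* Check the flag in bursts so the clock is read rarely. */
		int cnt = 1024;

		while (atomic_load(&breather_depth) && --cnt)
			;	/* the kernel version calls cpu_relax() here */
	} while (atomic_load(&breather_depth) && now_ns() < until);

	return now_ns() - start;
}
```

The point of the sketch is that the CPU stays busy for the whole wait; the
flag is polled in bursts of 1024 checks and the clock is only read between
bursts. Injecting idle cycles would let the CPU actually go idle for that
window instead.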

-Andrea