Re: [PATCH v2 sched_ext/for-7.1] sched_ext: Invalidate dispatch decisions on CPU affinity changes

From: Kuba Piecuch

Date: Mon May 04 2026 - 04:05:42 EST


Hi Cheng-Yang,

On Fri May 1, 2026 at 4:19 PM UTC, Cheng-Yang Chou wrote:
>> >> 2. Do we want to restrict ourselves through the one qseq slot provided by
>> >> dsq_insert_begin()? The most flexible approach IMO would be to simply
>> >> allow BPF to read the qseq directly via a kfunc and then supply it to
>> >> dsq_insert() later. With this, we can have multiple qseqs saved at the
>> >> same time, and we can even pass them between CPUs, e.g. if one CPU
>> >> dequeues a task for a sibling CPU, but we want the checks to be made inside
>> >> the sibling's ops.dispatch() (I just made this use case up, it may not
>> >> be practical.)
>> >> That said, exposing an internal thing like qseq to BPF may be a step too far.
>> >
>> > In Tejun's reply back in [1], he suggested dsq_insert_begin() precisely
>> > to avoid promoting qseq into the BPF ABI — which matches your own concern.
>> > The single per-CPU slot is sufficient for the one-task-per-iteration
>> > dispatch loops used by existing schedulers (e.g., scx_central).
>> > If a concrete cross-CPU use case materializes later, we can always extend
>> > dsq_insert() to accept an explicit qseq without breaking the current,
>> > simpler path.
>> >
>> > [1]: https://lore.kernel.org/all/acHJED4iAeytdC2l@xxxxxxxxxxxxxxx/
>> >
>>
>> Well, Tejun doesn't explicitly say there that he's against exposing qseq, but
>> I won't be surprised if he is.
>>
>> FWIW, ghOSt (our Google-internal BPF scheduling solution) uses exactly this
>> approach to guard the dispatch path against racing dequeues/enqueues.
>> Every task has a seqnum that gets incremented on each "event" pertaining to
>> the task. In the dispatch path, the BPF scheduler reads the task seqnum,
>> does whatever checks it needs to do, and passes the seqnum to ghOSt at the end.
>>
>> Admittedly, what works downstream doesn't have to work upstream, but I still
>> wanted to provide this data point :-)
>
> The ghOSt data point is appreciated. If a concrete use case emerges where
> the single-slot approach falls short, extending dsq_insert() to accept an
> explicit qseq seems like a natural next step.
>
> Tejun, Andrea, sched-ext folks, any preferences?

Random thought: If exposing qseq values to BPF directly is undesirable, then
perhaps a less objectionable approach would be to expose them as opaque
cookie/token values? Same semantics, but fewer SCX internals leaking to BPF.
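
To make the idea concrete, here is a minimal userspace sketch of the token
variant. It is a model only, not sched_ext code: struct task_model,
scx_token_t, task_event(), task_token() and dsq_insert_token() are all
hypothetical names made up for illustration, and the real qseq handling in
the kernel may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; none of these names exist in sched_ext. */
struct task_model {
	uint64_t qseq;		/* bumped on every event affecting the task */
	bool dispatched;	/* set once an insert has been accepted */
};

/* Opaque to the caller: the contents are private to the "kernel" side. */
typedef struct { uint64_t v; } scx_token_t;

/* Hypothetical: any event that should invalidate a pending decision. */
static void task_event(struct task_model *p)
{
	p->qseq++;
}

/* Hypothetical kfunc: snapshot the task's state as an opaque token. */
static scx_token_t task_token(const struct task_model *p)
{
	return (scx_token_t){ .v = p->qseq };
}

/*
 * Hypothetical insert: accept the dispatch only if no event happened
 * between taking the token and calling this function.
 */
static bool dsq_insert_token(struct task_model *p, scx_token_t tok)
{
	assert(p);
	if (tok.v != p->qseq)
		return false;	/* stale token, decision invalidated */
	p->dispatched = true;
	return true;
}
```

Because the caller only ever stores and hands back an scx_token_t, several
tokens can be held at once or passed between CPUs, while the encoding stays
free to change without becoming part of the BPF-visible interface.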

Thanks,
Kuba