Re: [PATCHSET sched_ext/for-7.1] sched_ext: Implement SCX_ENQ_IMMED

From: Andrea Righi

Date: Sun Mar 08 2026 - 04:54:39 EST

On Sat, Mar 07, 2026 at 02:19:46PM -1000, Tejun Heo wrote:
> Hello,
>
> On Sat, Mar 07, 2026 at 11:36:46PM +0100, Andrea Righi wrote:
> > This looks interesting, but I'm trying to understand the typical use case
> > of this feature.
> >
> > I agree that we need some kernel support to "atomically" determine when a
> > CPU is available (it can't be done fully in BPF). Initially I thought the
> > main target for ENQ_IMMED was to improve latency-sensitive workloads, but
> > this actually hurts latency, due to the additional re-enqueue cost and in
> > this case it might be better to be "less perfect" and not use ENQ_IMMED.
>
> I don't see how it'd worsen latency. You atomically get the CPU or not. If
> you don't, the only thing you can do is reenqueueing to find an alternate
> cpu if available. If you don't do that, the task would end up waiting for
> the CPU which is now busy doing something else to open up in the local DSQ.

Yeah, I need to do more tests with this. I did a quick test with scx_cosmos,
enabling SCX_OPS_ALWAYS_ENQ_IMMED and, in ops.enqueue(), taking the
migration attempt path (find another idle CPU) when SCX_ENQ_REENQ is set,
and I'm noticing a 5-10% regression in avg fps / tail latency.

Maybe that's too simplistic a solution. I think with the global
SCX_OPS_ALWAYS_ENQ_IMMED I may end up skipping some direct dispatches from
ops.select_cpu(), which is probably what is hurting performance in my case,
but it's just a guess; I'll investigate more.

Thanks,
-Andrea