Hello,
On Tue, Sep 24, 2024 at 09:10:02AM +0530, K Prateek Nayak wrote:
> 	prev_state = READ_ONCE(prev->__state);
> 	if (sched_mode == SM_IDLE) {
> -		if (!rq->nr_running) {
> +		/* SCX must consult the BPF scheduler to tell if rq is empty */
>
> I was wondering if sched_ext case could simply do:
>
> 	if (scx_enabled())
> 		prev_balance(rq, prev, rf);
>
> and use "rq->scx.flags" to skip balancing in balance_scx() later when
> __pick_next_task() calls prev_balance() but (and please correct me if
> I'm wrong here) balance_scx() calls balance_one() which can call
> consume_dispatch_q() to pick a task from global / user-defined dispatch
> queue, and in doing so, it does not update "rq->nr_running".

Hmm... would that be a meaningful optimization? prev_balance() calls into
SCX's dispatch path and there can be quite a bit going on there. I'm not
sure whether it'd be worth much to save a trip through __pick_next_task().

> I could only see add_nr_running() being called from enqueue_task_scx()
> and this is even before the ext core calls do_enqueue_task() which hooks
> into the bpf layer which makes the decision where the task actually
> goes.
>
> Is my understanding correct that whichever CPU is the target for the
> enqueue_task_scx() callback initially is the one that accounts the
> enqueue in "rq->nr_running" until the task is dequeued or did I miss
> something?

Whenever a task is dispatched to the local DSQ of a CPU, including from
balance_one(), and the task is not already on that CPU,
move_remote_task_to_local_dsq() is called, which migrates the task to the
target CPU by deactivating and then re-activating it. As deactivating and
re-activating involve dequeueing and re-enqueueing, rq->nr_running gets
updated accordingly.
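
To make the accounting concrete, here is a toy userspace model of that
sequence. It is only a sketch and not kernel code: struct toy_rq, struct
toy_task and all toy_*() helpers are made up for illustration, and the real
path also deals with locking and DSQ bookkeeping that are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct rq; only the counter under discussion. */
struct toy_rq {
	int cpu;
	unsigned int nr_running;
};

/* Toy stand-in for a task; tracks which rq currently accounts for it. */
struct toy_task {
	struct toy_rq *rq;	/* NULL while the task is not enqueued */
};

/* Models activation: enqueueing bumps the target rq's nr_running. */
static void toy_activate_task(struct toy_rq *rq, struct toy_task *p)
{
	assert(p->rq == NULL);
	p->rq = rq;
	rq->nr_running++;
}

/* Models deactivation: dequeueing drops the owning rq's nr_running. */
static void toy_deactivate_task(struct toy_task *p)
{
	assert(p->rq != NULL && p->rq->nr_running > 0);
	p->rq->nr_running--;
	p->rq = NULL;
}

/*
 * Models a dispatch to dst's local DSQ. If the task is accounted on
 * another rq, it is deactivated there and re-activated on dst, so the
 * count moves along with the task. A task already on dst is left alone.
 */
static void toy_move_to_local_dsq(struct toy_task *p, struct toy_rq *dst)
{
	if (p->rq == dst)
		return;
	toy_deactivate_task(p);
	toy_activate_task(dst, p);
}
```

In this model, a task first enqueued on CPU 0 and later dispatched to CPU
1's local DSQ leaves CPU 0 with nr_running == 0 and CPU 1 with
nr_running == 1, so the CPU that saw the initial enqueue does not keep
accounting for the task after the move.
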
Thanks.