Re: [External] Re: [PATCH] sched/core: Minor optimize pick_next_task() when core-sched enable

From: Hao Jia
Date: Fri Mar 24 2023 - 02:48:35 EST




On 2023/3/24 Vineeth Pillai wrote:
On Thu, Mar 23, 2023 at 3:03 AM Hao Jia <jiahao.os@xxxxxxxxxxxxx> wrote:

The other issue was that we don't update the core rbtree when vruntime
changes, and this can cause starvation of a cookied task if there is more
than one task with the same cookie on an rq.


If I understand correctly, when a cookied task is enqueued, the
difference delta1 between its vruntime and min_vruntime is very large.

Another task with the same cookie is dequeued and enqueued very
frequently, and the difference delta2 between its vruntime and
min_vruntime is always smaller than delta1?
Is that the case you mean?

The case I was mentioning involves tasks that are continuously running
and hence always on the runqueue. sched_core_enqueue/dequeue is not
called for them, so their position in the core rbtree is static while
their cfs rbtree positions change as vruntime progresses.


Thanks for the detailed explanation.
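
To check my understanding of the static-position problem, here is a toy
user-space model. It is only a sketch: the names (toy_task, core_key,
toy_core_pick and so on) are hypothetical and do not exist in the kernel,
and the "tree" is just an array scan. The one assumption it illustrates is
that the ordering key used for the core-wide pick is fixed at enqueue time
and is not refreshed while the task keeps running.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_task {
	unsigned long cookie;
	unsigned long long vruntime;	/* advances while the task runs */
	unsigned long long core_key;	/* ordering key, frozen at enqueue */
};

/* Loosely models an enqueue into the core tree: the key is fixed here. */
static void toy_core_enqueue(struct toy_task *t)
{
	assert(t != NULL);
	t->core_key = t->vruntime;
}

/* The task runs without being dequeued, so only vruntime moves. */
static void toy_run(struct toy_task *t, unsigned long long delta)
{
	t->vruntime += delta;
}

/* Pick the matching-cookie task with the smallest frozen key. */
static struct toy_task *toy_core_pick(struct toy_task *tasks, size_t n,
				      unsigned long cookie)
{
	struct toy_task *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].cookie != cookie)
			continue;
		if (!best || tasks[i].core_key < best->core_key)
			best = &tasks[i];
	}
	return best;
}
```

With two tasks sharing a cookie, enqueued at vruntime 100 and 200, the
pick keeps returning the first task even after toy_run() has pushed its
vruntime well past the second one, which is the starvation pattern
described above.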

BTW, this is a separate issue from the one you are targeting with this
fix. I just thought of mentioning it here as well.

Yeah, this is an absolute no-no, it makes the overhead of the second rb
tree unconditional.

I agree. Could we keep it conditional by enqueuing 0-cookied tasks only when
coresched is enabled, just like what we do for cookied tasks? This still has
overhead, since we would have two trees storing all the runnable tasks in
different orders. We would also need to populate the core rbtree from the
cfs rbtree on coresched enable and empty the tree on coresched disable.


I'm not sure whether the other approach is reasonable: I'm trying to
provide a function for each scheduling class to find its highest-priority
non-cookie task.

For example, for fair_sched_class we can traverse rq->cfs_tasks to do the
search. But this search may take a long time, so we may need to limit the
number of entries it examines.
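
A rough user-space sketch of such a bounded search is below. It is not
kernel code: sketch_task, its next pointer (standing in for the
rq->cfs_tasks linkage), pick_uncookied_bounded() and max_scan are all
hypothetical names. The list is not assumed to be sorted by priority, so
the sketch tracks the smallest vruntime among the entries it visits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's task_struct or list_head. */
struct sketch_task {
	unsigned long cookie;		/* 0 means "no cookie" */
	unsigned long long vruntime;
	struct sketch_task *next;	/* stand-in for the cfs_tasks linkage */
};

/*
 * Visit at most max_scan entries and return the non-cookie task with the
 * smallest vruntime among those visited, or NULL if none was seen.
 */
static struct sketch_task *pick_uncookied_bounded(struct sketch_task *head,
						  int max_scan)
{
	struct sketch_task *best = NULL;
	struct sketch_task *t;

	assert(max_scan >= 0);
	for (t = head; t && max_scan > 0; t = t->next, max_scan--) {
		if (t->cookie != 0)
			continue;	/* cookied task: not a candidate */
		if (!best || t->vruntime < best->vruntime)
			best = t;
	}
	return best;
}
```

The cost of the bound is accuracy: if the best non-cookie task sits beyond
the first max_scan entries, the function returns a worse candidate or
nothing at all.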

Yes, it can be time consuming, depending on the number of cgroups and tasks
that are runnable. You could probably take some performance numbers to
see how much worse it is.

I agree, this can be very bad if there are a lot of tasks on the rq. But
using the cfs rbtree to find the highest-priority non-cookie task becomes
very complicated when CONFIG_FAIR_GROUP_SCHED is enabled.

Thanks,
Hao


We could also have some optimization like marking a runqueue as having
non-cookied tasks and then doing the search only if it is marked. I haven't
thought much about it, but hopefully the search could be optimized.
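
One possible shape for the marking idea, as a user-space sketch only. The
names (mark_rq, nr_uncookied, mark_should_search and so on) are
hypothetical. Using a counter rather than a single flag is an assumption
of this sketch, chosen so that the mark clears again when the last
non-cookied task leaves. A task whose cookie changes while it is enqueued
is not handled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-runqueue state; not a real struct rq field. */
struct mark_rq {
	unsigned int nr_uncookied;	/* runnable tasks with cookie == 0 */
};

static void mark_enqueue(struct mark_rq *rq, unsigned long cookie)
{
	if (cookie == 0)
		rq->nr_uncookied++;
}

static void mark_dequeue(struct mark_rq *rq, unsigned long cookie)
{
	if (cookie == 0) {
		assert(rq->nr_uncookied > 0);
		rq->nr_uncookied--;
	}
}

/* The search for a non-cookied task is only worth doing when marked. */
static bool mark_should_search(const struct mark_rq *rq)
{
	return rq->nr_uncookied > 0;
}
```

A pick path would check mark_should_search() first and skip the list
traversal entirely when it returns false.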