Re: [RFC] sched/fair: Use wake_q length as a hint for wake_wide

From: Brendan Jackman
Date: Wed Sep 20 2017 - 05:41:20 EST


Hi Joel,

Sorry I didn't see your comments on the code before, I think it's
orthoganal to the other thread about the overall design so I'll just
respond here.

On Tue, Sep 19 2017 at 05:15, Joel Fernandes wrote:
> Hi Brendan,
>
> On Fri, Aug 11, 2017 at 2:45 AM, Brendan Jackman

[snip]

>>
>> diff --git a/include/linux/sched/wake_q.h b/include/linux/sched/wake_q.h
>> index d03d8a9047dc..607a888eb35b 100644
>> --- a/include/linux/sched/wake_q.h
>> +++ b/include/linux/sched/wake_q.h
>> @@ -33,6 +33,7 @@
>> struct wake_q_head {
>> struct wake_q_node *first;
>> struct wake_q_node **lastp;
>> + int count;
>> };
>>
>> #define WAKE_Q_TAIL ((struct wake_q_node *) 0x01)
>> @@ -44,6 +45,7 @@ static inline void wake_q_init(struct wake_q_head *head)
>> {
>> head->first = WAKE_Q_TAIL;
>> head->lastp = &head->first;
>> + head->count = 0;
>> }
>>
>> extern void wake_q_add(struct wake_q_head *head,
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 0869b20fba81..ddf9257b1467 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -438,6 +438,8 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task)
>> if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL))
>> return;
>>
>> + head->count++;
>> +
>> get_task_struct(task);
>>
>> /*
>> @@ -447,6 +449,10 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task)
>> head->lastp = &node->next;
>> }
>>
>> +static int
>> +try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags,
>> + int sibling_count_hint);
>> +
>> void wake_up_q(struct wake_q_head *head)
>> {
>> struct wake_q_node *node = head->first;
>> @@ -461,10 +467,10 @@ void wake_up_q(struct wake_q_head *head)
>> task->wake_q.next = NULL;
>>
>> /*
>> - * wake_up_process() implies a wmb() to pair with the queueing
>> + * try_to_wake_up() implies a wmb() to pair with the queueing
>> * in wake_q_add() so as not to miss wakeups.
>> */
>> - wake_up_process(task);
>> + try_to_wake_up(task, TASK_NORMAL, 0, head->count);
>> put_task_struct(task);
>
> Shouldn't you reset head->count after all the tasks have been woken up?

That's done in wake_q_init, which should be enough as you only use a
wake_q once per init [1]

[1] http://elixir.free-electrons.com/linux/latest/source/include/linux/sched/wake_q.h#L33

>
>> }
>> }
>> @@ -1527,12 +1533,14 @@ static int select_fallback_rq(int cpu, struct task_struct *p)
>> * The caller (fork, wakeup) owns p->pi_lock, ->cpus_allowed is stable.
>> */
>> static inline
>> -int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags)
>> +int select_task_rq(struct task_struct *p, int cpu, int sd_flags, int wake_flags,
>> + int sibling_count_hint)
>
> This variable seems a bit long to me, how about just sibling_count?

Yeah, shortening sounds good. Coming back with fresh eyes I think
'sibling' is a bad description, I was thinking of siblings in the
waker/wakee graph of tasks actually but I don't think that's obvious at
all. This is just an RFC so if it ever makes it to PATCH I'll try and
think of something better.

>> {
>> lockdep_assert_held(&p->pi_lock);
>>
>> if (p->nr_cpus_allowed > 1)
>> - cpu = p->sched_class->select_task_rq(p, cpu, sd_flags, wake_flags);
>> + cpu = p->sched_class->select_task_rq(p, cpu, sd_flags, wake_flags,
>> + sibling_count_hint);
>
> same.
>
> <snip>
>
>>
>> static int
>> -select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags)
>> +select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags,
>> + int sibling_count_hint)
>> {
>> struct task_struct *curr;
>> struct rq *rq;
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index eeef1a3086d1..56ae525618e9 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -1419,7 +1419,8 @@ struct sched_class {
>> void (*put_prev_task) (struct rq *rq, struct task_struct *p);
>>
>> #ifdef CONFIG_SMP
>> - int (*select_task_rq)(struct task_struct *p, int task_cpu, int sd_flag, int flags);
>> + int (*select_task_rq)(struct task_struct *p, int task_cpu, int sd_flag, int flags,
>> + int subling_count_hint);
>
> s/subling/sibling/

Yup :|

Thanks,
Brendan