Re: [PATCH v6 2/2] sched/uclamp: Protect uclamp fast path code with static key
From: Qais Yousef
Date: Tue Jun 30 2020 - 13:55:09 EST
On 06/30/20 19:07, Peter Zijlstra wrote:
> On Tue, Jun 30, 2020 at 12:21:23PM +0100, Qais Yousef wrote:
> > @@ -993,10 +1013,38 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
> >
> > lockdep_assert_held(&rq->lock);
> >
> > + /*
> > + * If sched_uclamp_used was enabled after task @p was enqueued,
> > +	 * we could end up with an unbalanced call to uclamp_rq_dec_id().
> > + *
> > + * In this case the uc_se->active flag should be false since no uclamp
> > + * accounting was performed at enqueue time and we can just return
> > + * here.
> > + *
> > +	 * Need to be careful of the following enqueue/dequeue ordering
> > +	 * problem too:
> > + *
> > + * enqueue(taskA)
> > + * // sched_uclamp_used gets enabled
> > + * enqueue(taskB)
> > + * dequeue(taskA)
> > +	 *	// Must not decrement bucket->tasks here
> > + * dequeue(taskB)
> > + *
> > + * where we could end up with stale data in uc_se and
> > + * bucket[uc_se->bucket_id].
> > + *
> > +	 * The following check here eliminates the possibility of such a race.
> > + */
> > + if (unlikely(!uc_se->active))
> > + return;
> > +
> > bucket = &uc_rq->bucket[uc_se->bucket_id];
> > +
> > SCHED_WARN_ON(!bucket->tasks);
> > if (likely(bucket->tasks))
> > bucket->tasks--;
> > +
> > uc_se->active = false;
> >
> > /*
>
> > @@ -1221,6 +1289,8 @@ static void __setscheduler_uclamp(struct task_struct *p,
> > if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
> > return;
> >
> > + static_branch_enable(&sched_uclamp_used);
> > +
> > if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
> > uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
> > attr->sched_util_min, true);
> > @@ -7387,6 +7457,8 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
> > if (req.ret)
> > return req.ret;
> >
> > + static_branch_enable(&sched_uclamp_used);
> > +
> > mutex_lock(&uclamp_mutex);
> > rcu_read_lock();
> >
>
> There's a fun race described in 9107c89e269d ("perf: Fix race between
> event install and jump_labels"), are we sure this isn't also susceptible
> to something similar?
>
> I suspect not, but I just wanted to make sure.
IIUC, the worry is that not all CPUs might have observed the change in the
static key state, and hence might not yet be running the patched
enqueue/dequeue_task(), so we could end up with some CPUs accounting for
uclamp in the enqueue/dequeue path but not others?
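To make sure I'm picturing the same thing, a hypothetical interleaving (the
taskA/taskB names and the CPU split here are my own, not from your example):

    CPU0                                    CPU1

    static_branch_enable(&sched_uclamp_used)
    enqueue(taskA)    // patched, inc
                                            enqueue(taskB)    // stale code, NOP
                                            dequeue(taskB)    // patched by now, dec?!

If that's the scenario, taskB's uc_se->active would still be false since its
enqueue skipped the accounting, so the check added in uclamp_rq_dec_id() above
should catch this cross-CPU variant of the race too.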
I was hoping this synchronization would be guaranteed by the static_branch_*() call.
aarch64_insn_patch_text_nosync() in arch/arm64/kernel/insn.c performs
__flush_icache_range() after writing the new instruction.
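For reference, IIRC the function is roughly this (paraphrasing from memory,
not quoting the exact source):

	int aarch64_insn_patch_text_nosync(void *addr, u32 insn)
	{
		u32 *tp = addr;
		int ret;

		/* A64 instructions must be word aligned */
		if ((uintptr_t)tp & 0x3)
			return -EINVAL;

		ret = aarch64_insn_write(tp, insn);
		if (ret == 0)
			__flush_icache_range((uintptr_t)tp,
					     (uintptr_t)tp + AARCH64_INSN_SIZE);

		return ret;
	}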
I need to dig into what __flush_icache_range() does, but I think it'll just
flush the icache of the calling CPU. I need to dig into the upper layers too.
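Even if remote CPUs lag behind the text patching for a while, the fast path
guard in this patch is just something like (abbreviating the earlier hunk of
this series, not quoted verbatim):

	static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
	{
		/*
		 * Avoid any overhead until uclamp is actually used by
		 * userspace.
		 */
		if (!static_branch_unlikely(&sched_uclamp_used))
			return;

		/* ... per clamp_id accounting ... */
	}

so the worst case should be a skipped increment on a CPU still running the old
NOP, and the uc_se->active check in uclamp_rq_dec_id() covers the matching
dequeue. But I could be missing the subtlety you're after.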
It'd be good to know if I've understood you correctly first.
Thanks
--
Qais Yousef