Re: [PATCH v6 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
From: Valentin Schneider
Date: Mon Jul 06 2020 - 11:49:33 EST
On 06/07/20 15:28, Qais Yousef wrote:
> CC: linux-fsdevel@xxxxxxxxxxxxxxx
> ---
>
> Peter
>
> I didn't do the
>
> read_lock(&taslist_lock);
> smp_mb__after_spinlock();
> read_unlock(&tasklist_lock);
>
> dance you suggested on IRC as it didn't seem necessary. But maybe I missed
> something.
>
So the annoying bit with just uclamp_fork() is that it happens *before* the
task is appended to the tasklist. This means without too much care we
would have (if we'd do a sync at uclamp_fork()):
CPU0 (sysctl write) CPU1 (concurrent forker)
copy_process()
uclamp_fork()
p.uclamp_min = state
state = foo
for_each_process_thread(p, t)
update_state(t);
list_add(p)
i.e. that newly forked process would entirely sidestep the update. Now,
with Peter's suggested approach we can be in a much better situation. If we
have this in the sysctl update:
state = foo;
read_lock(&taslist_lock);
smp_mb__after_spinlock();
read_unlock(&tasklist_lock);
for_each_process_thread(p, t)
update_state(t);
While having this in the fork:
write_lock(&tasklist_lock);
list_add(p);
write_unlock(&tasklist_lock);
sched_post_fork(p); // state re-read here; probably wants an mb first
Then we can no longer miss an update. If the forked p doesn't see the new
value, it *must* have been added to the tasklist before the updater loops
over it, so the loop will catch it. If it sees the new value, we're done.
AIUI, the above strategy doesn't require any use of RCU. The update_state()
and sched_post_fork() can race, but as per the above they should both be
writing the same value.