On 04/11/24 17:41, Joel Fernandes wrote:
On Mon, Nov 04, 2024 at 11:54:36AM +0100, Juri Lelli wrote:...
So, I played a little bit with it and came up with a simple set of opsI added a printk in __dl_server_attach_root which is called after theYou nailed it! Our 5.15 backport appears to be slightly older and is missing
dynamic rd is built to transfer bandwidth to it.
__dl_server_attach_root came with d741f297bceaf ("sched/fair: Fair
server interface"), do you have this change in your backport?
this from topology.c as you mentioned. Thanks for clarifying!
/*
* Because the rq is not a task, dl_add_task_root_domain() did not
* move the fair server bw to the rd if it already started.
* Add it now.
*/
if (rq->fair_server.dl_server)
__dl_server_attach_root(&rq->fair_server, rq);
Sounds good, that would be good to verify.So if rd changes during boot initialization, the correct dl_bw has to becpusets changes are something that I still need to double check. Will
updated AFAICS. Also if cpusets are used, the rd for a CPU may change.
do.
that point out an issue (default fedora server install):
echo Y >/sys/kernel/debug/sched/verbose
echo +cpuset >/sys/fs/cgroup/cgroup.subtree_control
echo 0-7 > /sys/fs/cgroup/user.slice/cpuset.cpus
echo 6-7 > /sys/fs/cgroup/user.slice/cpuset.cpus.exclusive
echo root >/sys/fs/cgroup/user.slice/cpuset.cpus.partition
The domains are rebuilt correctly, but we end up with a null total_bw.
The conditional call above takes care correctly of adding back dl_server
per-rq bandwidth when we pass from the single domain to the 2 exclusive
ones, but I noticed that we go through partition_sched_domains_locked()
twice for a single write of 'root' and the second one, since it's not
actually destroying/rebuilding anything, is resetting total_bw w/o
addition dl_server contribution back.
Now, not completely sure why we need to go through partition_sched_
domains_locked() twice, as we have (it also looked like a pattern from
other call paths)
update_prstate()
-> update_cpumasks_hier()
-> rebuild_sched_domains_locked() <- right at the end
-> update_partition_sd_lb()
-> rebuild_sched_domains_locked() <- right after the above call
Removing the first call does indeed fix the issue and domains look OK,
but I'm pretty sure I'm missing all sort of details and corner cases.
Waiman (now Cc-ed), maybe you can help here understanding why the two
back to back calls are needed?