MAEDA Naoaki wrote:ok, so basically the bit in cpu_rc_load() where for_each_cpu_mask() isThank you for looking the code.
called, in Maeda Naoaki's patch "CPU controller - Add class load
estimation support", is where O(N) creeps in that could be remedied with
a token bucket algorithm. You don't want this because if you have 10,000
processes on a system in two resource groups, the aggregate performance
will suffer due to the large number of cacheline misses during the 5,000
size loop that runs every resched.
cpu_rc_load() is never called unless sysadm tries to access the load
information via configfs from userland. In addition, it sums up per-CPU
group stats, so the size of loop is the number of CPU, not process in
the group.
However, there is a similer loop in cpu_rc_recalc_tsfactor(), which runs
every CPU_RC_RECALC_INTERVAL that is defined as HZ. I don't think it
will cause a big performance penalty.
Ok, so that's not as bad as it looked. So, while it is still O(N), the
fact that it is O(N/HZ) makes this not a problem until you get to
possibly impractical levels of runqueue length.
I'm thinking it's probably worth doing anyway, just so that it can be
performance tested to see if this performance guestimate is accurate.
To apply the token bucket here, you would first change the per-CPUCurrent 0(1) scheduler gives extra bonus for interactive tasks by
struct cpu_rc to have the TBF fields; minimally:
[...]
I think that the characteristics of these two approaches are subtly
different. Both scale timeslices, but in a different way - instead of
estimating the load and scaling back timeslices up front, busy Resource
Groups are relied on to deplete their tokens in a timely manner, and get
shorter slices allocated because of that. No doubt from 10,000 feet they
both look the same.
requeuing them to active array for a while. It would break
the controller's efforts. So, I'm planning to stop the interactive
task requeuing if the target share doesn't meet.
Are there a similar issue on the vserver scheduler?
Not an issue - those extra requeued timeslices are accounted for normally.