Re: [PATCH 14/17] sched/eevdf: Better handle mixed slice length

From: Mike Galbraith
Date: Wed Apr 05 2023 - 01:43:25 EST


On Tue, 2023-04-04 at 13:50 +0000, Joel Fernandes wrote:
> On Tue, Apr 04, 2023 at 11:29:36AM +0200, Peter Zijlstra wrote:
> > On Fri, Mar 31, 2023 at 05:26:51PM +0200, Vincent Guittot wrote:
> >
> > >
> > > Task A always runs
> > > Task B loops on: running 1ms then sleeping 1ms
> > > default nice and latency nice prio for both
> > > each task should get around 50% of the time.
> > >
> > > The fairness is ok with tip/sched/core
> > > but with eevdf, Task B only gets around 30%
> > >
> > > I haven't identified the problem so far
> >
> > Heh, this is actually the correct behaviour. If you have a u=1 and a
> > u=.5 task, you should distribute time on a 2:1 basis, eg. 67% vs 33%.
>
> Splitting like that sounds like starvation of the sleeper to me. If something
> sleeps a lot, it will get even less CPU time on average than it would if
> there was no contention from the u=1 task.
>
> And cgroups will be even weirder than they already are in such a world:
> 2 different containers will not get CPU time distributed properly, say if
> tasks in one container sleep a lot and tasks in another container are CPU
> bound.
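
For reference, the 2:1 figure quoted above reads as CPU time split in
proportion to each task's utilization. A minimal sketch of that
arithmetic, assuming that reading; proportional_split is a hypothetical
helper, not anything in the scheduler:

```python
def proportional_split(utilizations):
    """Split one CPU in proportion to each task's utilization (assumed model)."""
    total = sum(utilizations)
    return [u / total for u in utilizations]


# Task A: u=1 (always runnable); Task B: u=.5 (runs 1ms, sleeps 1ms).
shares = proportional_split([1.0, 0.5])
for name, share in zip(("A", "B"), shares):
    print(f"Task {name}: {share:.0%}")
```

This only reproduces the 67% vs 33% arithmetic; it says nothing about
whether that split is the desirable one.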

Let's take a quick peek at some group distribution numbers.

Start tbench and massive_intr, each in its own VT (so each lands in its
own autogroup), then in another VT run:
sleep 300;killall top massive_intr tbench_srv tbench

(caveman method because perf's refusing to handle fast switchers well
for me.. top's plenty good enough for this anyway, and less intrusive)

massive_intr runs 8ms, sleeps 1ms, and so wants 88.8% of 8 runqueues.
tbench buddy pairs want only a tad more CPU, 100% between the two of
them, but switch orders of magnitude more frequently. Very dissimilar
breeds of hog.
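
The 88.8% is just the duty cycle of an 8ms run / 1ms sleep loop. A rough
sketch of the demand arithmetic, assuming 8 massive_intr tasks and 8
tbench/tbench_srv pairs (the counts visible in the top output below);
duty_cycle is a hypothetical helper:

```python
def duty_cycle(run_ms, sleep_ms):
    """Fraction of wall time a run/sleep loop wants to be on a CPU."""
    return run_ms / (run_ms + sleep_ms)


MASSIVE_INTR_TASKS = 8  # assumption: one per runqueue
TBENCH_PAIRS = 8        # assumption: each pair wants 100% between them

per_task = duty_cycle(8, 1)
massive_demand = MASSIVE_INTR_TASKS * per_task
tbench_demand = TBENCH_PAIRS * 1.0

print(f"massive_intr per task: {per_task:.3f}")
print(f"massive_intr total:    {massive_demand:.2f} CPUs")
print(f"tbench total:          {tbench_demand:.2f} CPUs")
```

Both groups ask for well over half of the 8 CPUs, so an even split
between the two autogroups is the expected result.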

master.today        accrued   of 2400s   vs master
team massive_intr  1120.50s     .466       1.000
team tbench        1256.13s     .523       1.000

+eevdf
team massive_intr  1071.94s     .446        .956
team tbench        1301.56s     .542       1.036
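
The ratio columns can be rechecked without bc. A small sketch, assuming
the 2400s is 300s of wall time times 8 CPUs, and that the ratios were
truncated to three digits the way bc does with scale=3; bc_ratio is a
hypothetical helper:

```python
from decimal import Decimal, ROUND_DOWN


def bc_ratio(numerator, denominator, scale=3):
    """Divide and truncate to `scale` digits, like bc with scale=3."""
    quotient = Decimal(numerator) / Decimal(denominator)
    return quotient.quantize(Decimal(10) ** -scale, rounding=ROUND_DOWN)


TOTAL = "2400"  # assumption: 300s * 8 CPUs
master = {"massive_intr": "1120.50", "tbench": "1256.13"}
eevdf = {"massive_intr": "1071.94", "tbench": "1301.56"}

for team in master:
    print(team,
          bc_ratio(master[team], TOTAL),        # master share of 2400s
          bc_ratio(eevdf[team], TOTAL),         # eevdf share of 2400s
          bc_ratio(eevdf[team], master[team]))  # eevdf vs master
```

Truncation rather than rounding is what yields .466 and .956 instead of
.467 and .957.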

There is of course a distribution delta.. but was it meaningful?

Between mostly idle but kinda noisy GUI perturbing things, and more
importantly, neither load having been manually distributed and pinned,
both schedulers came out pretty good, and both a tad shy of.. perfect
is the enemy of good.

Raw numbers below in case my mouse mucked up feeding of numbers to bc
(blame the inanimate, they can't do a damn thing about it).

6.3.0.g148341f-master
  PID USER     PR  NI   VIRT   RES   SHR S  %CPU  %MEM    TIME+  P COMMAND
 5641 root     20   0   2564   640   640 R 50.33 0.004  2:17.19  5 massive_intr
 5636 root     20   0   2564   640   640 S 49.00 0.004  2:20.05  5 massive_intr
 5647 root     20   0   2564   640   640 R 48.67 0.004  2:21.85  6 massive_intr
 5640 root     20   0   2564   640   640 R 48.00 0.004  2:21.13  6 massive_intr
 5645 root     20   0   2564   640   640 R 47.67 0.004  2:18.25  5 massive_intr
 5638 root     20   0   2564   640   640 R 46.67 0.004  2:22.39  2 massive_intr
 5634 root     20   0   2564   640   640 R 45.00 0.004  2:18.93  4 massive_intr
 5643 root     20   0   2564   640   640 R 44.00 0.004  2:20.71  7 massive_intr
 5639 root     20   0  23468  1664  1536 R 29.00 0.010  1:22.31  3 tbench
 5644 root     20   0  23468  1792  1664 R 28.67 0.011  1:22.32  3 tbench
 5637 root     20   0  23468  1664  1536 S 28.00 0.010  1:22.75  5 tbench
 5631 root     20   0  23468  1792  1664 R 27.00 0.011  1:21.47  4 tbench
 5632 root     20   0  23468  1536  1408 R 27.00 0.010  1:21.78  0 tbench
 5653 root     20   0   6748   896   768 S 26.67 0.006  1:15.26  3 tbench_srv
 5633 root     20   0  23468  1792  1664 R 26.33 0.011  1:22.53  0 tbench
 5635 root     20   0  23468  1920  1792 R 26.33 0.012  1:20.72  7 tbench
 5642 root     20   0  23468  1920  1792 R 26.00 0.012  1:21.73  2 tbench
 5650 root     20   0   6748   768   768 R 25.67 0.005  1:15.71  1 tbench_srv
 5652 root     20   0   6748   768   768 S 25.67 0.005  1:15.71  3 tbench_srv
 5646 root     20   0   6748   768   768 S 25.33 0.005  1:14.97  4 tbench_srv
 5648 root     20   0   6748   896   768 S 25.00 0.006  1:14.66  0 tbench_srv
 5651 root     20   0   6748   896   768 S 24.67 0.006  1:14.79  2 tbench_srv
 5654 root     20   0   6748   768   768 R 24.33 0.005  1:15.47  0 tbench_srv
 5649 root     20   0   6748   768   768 R 24.00 0.005  1:13.95  7 tbench_srv

6.3.0.g148341f-master-eevdf
  PID USER     PR  NI   VIRT   RES   SHR S  %CPU  %MEM    TIME+  P COMMAND
10561 root     20   0   2564   768   640 R 49.83 0.005  2:14.86  3 massive_intr
10562 root     20   0   2564   768   640 R 49.50 0.005  2:14.00  3 massive_intr
10564 root     20   0   2564   896   768 R 49.50 0.006  2:14.11  6 massive_intr
10559 root     20   0   2564   768   640 R 47.84 0.005  2:14.03  2 massive_intr
10560 root     20   0   2564   768   640 R 45.51 0.005  2:13.92  7 massive_intr
10557 root     20   0   2564   896   768 R 44.85 0.006  2:13.59  7 massive_intr
10563 root     20   0   2564   896   768 R 44.85 0.006  2:13.53  6 massive_intr
10558 root     20   0   2564   768   640 R 43.52 0.005  2:13.90  2 massive_intr
10577 root     20   0  23468  1920  1792 R 35.22 0.012  1:37.06  0 tbench
10574 root     20   0  23468  1920  1792 R 32.23 0.012  1:32.89  4 tbench
10580 root     20   0  23468  1920  1792 R 29.57 0.012  1:34.95  0 tbench
10575 root     20   0  23468  1792  1664 R 29.24 0.011  1:31.66  4 tbench
10576 root     20   0  23468  1792  1664 S 28.57 0.011  1:34.55  5 tbench
10573 root     20   0  23468  1792  1664 R 28.24 0.011  1:33.17  5 tbench
10578 root     20   0  23468  1920  1792 S 28.24 0.012  1:33.97  1 tbench
10579 root     20   0  23468  1920  1792 R 28.24 0.012  1:36.09  1 tbench
10587 root     20   0   6748   768   640 S 26.91 0.005  1:09.45  0 tbench_srv
10582 root     20   0   6748   768   640 R 24.25 0.005  1:08.19  4 tbench_srv
10588 root     20   0   6748   640   640 R 22.59 0.004  1:09.15  0 tbench_srv
10583 root     20   0   6748   640   640 R 21.93 0.004  1:07.93  4 tbench_srv
10586 root     20   0   6748   640   640 S 21.59 0.004  1:07.92  1 tbench_srv
10581 root     20   0   6748   640   640 S 21.26 0.004  1:07.08  5 tbench_srv
10585 root     20   0   6748   640   640 R 21.26 0.004  1:08.89  5 tbench_srv
10584 root     20   0   6748   768   640 S 20.93 0.005  1:08.61  1 tbench_srv
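
For anyone wanting to redo the sums without a mouse, a minimal sketch
that totals TIME+ per team from top rows like the above. It assumes the
13-column layout shown here, with TIME+ as M:SS.hh; team_seconds and the
teams mapping are hypothetical names, and the sample holds only four of
the master rows:

```python
from collections import defaultdict


def time_to_centis(field):
    """Convert a top TIME+ field such as '2:17.19' to integer centiseconds."""
    minutes, rest = field.split(":")
    seconds, centis = rest.split(".")
    return (int(minutes) * 60 + int(seconds)) * 100 + int(centis)


def team_seconds(top_rows, teams):
    """Sum TIME+ per team; `teams` maps a COMMAND name to a team label."""
    totals = defaultdict(int)
    for row in top_rows.splitlines():
        fields = row.split()
        # A task row has 13 fields and starts with a numeric PID.
        if len(fields) != 13 or not fields[0].isdigit():
            continue
        team = teams.get(fields[12])
        if team is not None:
            totals[team] += time_to_centis(fields[10])
    return {team: centis / 100 for team, centis in totals.items()}


teams = {"massive_intr": "massive_intr",
         "tbench": "tbench", "tbench_srv": "tbench"}

sample = """\
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
5641 root 20 0 2564 640 640 R 50.33 0.004 2:17.19 5 massive_intr
5636 root 20 0 2564 640 640 S 49.00 0.004 2:20.05 5 massive_intr
5639 root 20 0 23468 1664 1536 R 29.00 0.010 1:22.31 3 tbench
5653 root 20 0 6748 896 768 S 26.67 0.006 1:15.26 3 tbench_srv
"""

print(team_seconds(sample, teams))
```

Fed the full master table, the same loop should land on the 1120.50s
and 1256.13s quoted earlier; tbench and tbench_srv are counted as one
team.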