Re: [RFC PATCH v7 11/23] sched/fair: core wide cfs task priority comparison

From: Joel Fernandes
Date: Tue Sep 22 2020 - 21:52:47 EST


On Tue, Sep 22, 2020 at 09:46:22PM -0400, Joel Fernandes wrote:
> On Fri, Aug 28, 2020 at 11:29:27PM +0200, Peter Zijlstra wrote:
> >
> >
> > This is still a horrible patch..
>
> Hi Peter,
> I wrote a new patch similar to this one and it fares much better in my tests.
> It is based on Aaron's idea, but I do the sync only during force-idle and not
> during enqueue. Also I yanked the whole 'core wide min_vruntime' crap. There
> is a regressing test which improves quite a bit with my patch (results below):
>
> Aaron, Vineeth, Chris any other thoughts? This patch is based on Google's
> 4.19 device kernel so will require some massaging to apply to mainline/v7
> series. I will provide an updated patch later based on v7 series.
>
> (Works only for SMT2, maybe we can generalize it more..)
> --------8<-----------
>
> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
> Subject: [PATCH] sched: Sync the min_vruntime of cores when the system enters
> force-idle
>
> This patch provides a vruntime based way to compare two cfs tasks' priorities,
> whether they are on the same cpu or on different threads of the same core.
>
> It is based on Aaron Lu's patch with some important differences. Namely,
> the vruntime is sync'ed only when the CPU goes into force-idle. Also I removed
> the notion of core-wide min_vruntime.
>
> Also I don't care how long a cpu in a core is force idled. I do my sync
> whenever the force idle starts, essentially bringing both SMT siblings to a
> common time base. After that point, selection can happen as usual.
>
> When running an Android audio test, with patch the perf sched latency output:
>
> -----------------------------------------------------------------------------------------------------------------
> Task                 |   Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at      |
> -----------------------------------------------------------------------------------------------------------------
> FinalizerDaemon:(2)  |    23.969 ms |      969 | avg: 0.504 ms    | max: 162.020 ms  | max at: 1294.327339 s
> HeapTaskDaemon:(3)   |  2421.287 ms |     4733 | avg: 0.131 ms    | max: 96.229 ms   | max at: 1302.343366 s
> adbd:(3)             |     6.101 ms |       79 | avg: 1.105 ms    | max: 84.923 ms   | max at: 1294.431284 s
>
> Without this patch and with Aubrey's initial patch (in v5 series), the max delay looks much better:
>
> -----------------------------------------------------------------------------------------------------------------
> Task                 |   Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at      |
> -----------------------------------------------------------------------------------------------------------------
> HeapTaskDaemon:(2)   |  2602.109 ms |     4025 | avg: 0.231 ms    | max: 19.152 ms   | max at: 522.903934 s
> surfaceflinger:7478  |    18.994 ms |     1206 | avg: 0.189 ms    | max: 17.375 ms   | max at: 520.523061 s
> ksoftirqd/3:30       |     0.093 ms |        5 | avg: 3.328 ms    | max: 16.567 ms   | max at: 522.903871 s

I messed up the change log. To clarify: the first result is without the
patch (bad) and the second result is with the patch (good).
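
For anyone skimming the thread, here is a toy user-space sketch of the sync
described in the quoted change log. It is not the patch itself: struct toy_rq,
toy_vruntime_before() and toy_sync_on_force_idle() are made-up names, and
shifting the lagging sibling forward by the min_vruntime difference is only an
assumption about one way to bring both SMT siblings to a common time base.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_TASKS 4

/* Hypothetical stand-in for a per-CPU cfs runqueue (not struct cfs_rq). */
struct toy_rq {
	uint64_t min_vruntime;
	uint64_t task_vruntime[TOY_MAX_TASKS];
	int nr_tasks;
};

/* Wrap-safe "a is earlier than b" test on unsigned vruntime values. */
static int toy_vruntime_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Assumed sync step, run once when force-idle starts: shift the sibling
 * whose min_vruntime lags so that both runqueues share one time base.
 * The relative order of tasks within each runqueue is preserved.
 */
static void toy_sync_on_force_idle(struct toy_rq *a, struct toy_rq *b)
{
	struct toy_rq *lag = a, *lead = b;
	uint64_t delta;
	int i;

	if (toy_vruntime_before(b->min_vruntime, a->min_vruntime)) {
		lag = b;
		lead = a;
	}

	assert(lag->nr_tasks >= 0 && lag->nr_tasks <= TOY_MAX_TASKS);

	delta = lead->min_vruntime - lag->min_vruntime;
	for (i = 0; i < lag->nr_tasks; i++)
		lag->task_vruntime[i] += delta;
	lag->min_vruntime += delta;
}
```

In this sketch, once the two siblings are synced, tasks from either runqueue
can be compared directly with toy_vruntime_before(), which is the "selection
can happen as usual" part of the change log.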

thanks,

- Joel