Re: [RFC PATCH v7 11/23] sched/fair: core wide cfs task priority comparison

From: chris hyser
Date: Wed Sep 16 2020 - 16:55:57 EST


On 9/16/20 10:24 AM, chris hyser wrote:
On 9/16/20 8:57 AM, Li, Aubrey wrote:
Here are the uperf results of the various patchsets. Note, that disabling smt is better for these tests and that that presumably reflects the overall overhead of core scheduling which went from bad to really bad. The primary focus in this email is to start to understand what happened within core sched itself.

patchset          smt=on/cs=off  smt=off    smt=on/cs=on
--------------------------------------------------------
v5-v5.6.y      :    1.78Gb/s     1.57Gb/s     1.07Gb/s
pre-v6-v5.6.y  :    1.75Gb/s     1.55Gb/s    822.16Mb/s
v6-5.7         :    1.87Gs/s     1.56Gb/s    561.6Mb/s
v6-5.7-hotplug :    1.75Gb/s     1.58Gb/s    438.21Mb/s
v7             :    1.80Gb/s     1.61Gb/s    440.44Mb/s

I haven't had a chance to play with v7, but I got something different.

   branch        smt=on/cs=on
coresched/v5-v5.6.y    1.09Gb/s
coresched/v6-v5.7.y    1.05Gb/s

I attached my kernel config in case you want to make a comparison, or you
can send yours, I'll try to see I can replicate your result.

I will give this config a try. One of the reports forwarded to me about the drop in uperf perf was an email from you I believe mentioning a 50% perf drop between v5 and v6?? I was actually setting out to duplicate your results. :-)

The first thing I did was to verify I built and tested the right bits. Presumably as I get same numbers. I'm still trying to tweak your config to get a root disk in my setup. Oh, one thing I missed in reading your first response, I had 24 cores/48 cpus. I think you had half that, though my guess is that that should have actually made the numbers even worse. :-)

The following was forwarded to me originally sent on Aug 3, by you I believe:

We found uperf(in cgroup) throughput drops by ~50% with corescheduling.

The problem is, uperf triggered a lot of softirq and offloaded softirq
service to *ksoftirqd* thread.

- default, ksoftirqd thread can run with uperf on the same core, we saw
100% CPU utilization.
- coresched enabled, ksoftirqd's core cookie is different from uperf, so
they can't run concurrently on the same core, we saw ~15% forced idle.

I guess this kind of performance drop can be replicated by other similar
(a lot of softirq activities) workloads.

Currently core scheduler picks cookie-match tasks for all SMT siblings, does
it make sense we add a policy to allow cookie-compatible task running together?
For example, if a task is trusted(set by admin), it can work with kernel thread.
The difference from corescheduling disabled is that we still have user to user
isolation.

Thanks,
-Aubrey

Would you please elaborate on what this test was? In trying to duplicate this, I just kept adding uperf threads to my setup until I started to see performance losses similar to what is reported above (and a second report about v7). Also, I wasn't looking for absolute numbers per-se, just significant enough differences to try to track where the performance went.

-chrish