Re: Gang scheduling

From: Subhra Mazumdar
Date: Mon Oct 15 2018 - 18:50:09 EST




On 10/12/2018 11:01 AM, Tim Chen wrote:
On 10/10/2018 05:09 PM, Subhra Mazumdar wrote:
Hi,

I was following the Coscheduling patch discussion on lkml and Peter mentioned he had a patch series. I found the following on github.

https://github.com/pdxChen/gang/commits/sched_1.23-loadbal

I would like to test this with KVMs. Are the commits from 38d5acb to f019876 sufficient? Also, is there any documentation on how to use it (any knobs I need to turn on for gang scheduling to happen?), or is it enabled by default for KVMs?

Thanks,
Subhra

I would suggest you try
https://github.com/pdxChen/gang/tree/sched_1.23-base
without the load balancing part of gang scheduling.
It is enabled by default for KVMs.

Due to the constant change in gang scheduling status of the QEMU thread
depending on whether vcpu is loaded or unloaded,
the load balancing part of the code doesn't work very well.
Thanks. Does this mean each vcpu thread needs to be affinitized to a cpu?

The current version of the code needs to be optimized further. Right now
the QEMU thread constantly does vcpu load and unload during VM enter and exit.
We gang schedule only after vcpu load, when we register the thread to be gang
scheduled. When we do vcpu unload, the thread is removed from the set
to be gang scheduled. Each time, there's an expensive synchronization
with the sibling thread.

However, for QEMU, there's a one-to-one correspondence between the QEMU
thread and vcpu. So we don't have to change the gang scheduling status
for such a thread, which avoids the churn and the sync with the sibling. That
should be helpful for VMs with lots of I/O causing constant VM exits. We're
still working on this optimization. And the load balancing should be
better after this change.

Tim
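
The register/unregister cost described above can be illustrated with a toy
model. This is only a sketch under stated assumptions: the struct, the
function names, and the rule that every membership change costs exactly one
sibling synchronization are all hypothetical and are not taken from the
gang scheduling tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one QEMU vcpu thread; not a kernel structure. */
struct vcpu_thread {
    bool in_gang;        /* currently registered for gang scheduling */
    unsigned long syncs; /* sibling synchronizations incurred so far */
};

/* Assumption: only a change of gang membership triggers a sibling sync. */
static void set_gang(struct vcpu_thread *t, bool on)
{
    assert(t);
    if (t->in_gang != on) {
        t->in_gang = on;
        t->syncs++;
    }
}

/* Behavior as described: register on vcpu load, unregister on vcpu unload. */
static unsigned long run_toggle(unsigned long exits)
{
    struct vcpu_thread t = { false, 0 };

    for (unsigned long i = 0; i < exits; i++) {
        set_gang(&t, true);  /* vcpu load */
        set_gang(&t, false); /* vcpu unload */
    }
    return t.syncs;
}

/* Planned optimization as described: keep the status across load/unload. */
static unsigned long run_sticky(unsigned long exits)
{
    struct vcpu_thread t = { false, 0 };

    set_gang(&t, true);      /* register once */
    for (unsigned long i = 0; i < exits; i++)
        set_gang(&t, true);  /* load/unload leaves membership unchanged */
    return t.syncs;
}
```

In this model, run_toggle(n) grows with the number of VM exits while
run_sticky(n) stays constant, which is the effect the optimization is
aiming for on I/O-heavy guests.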


Also FYI I get the following error while building sched_1.23-base:

ERROR: "sched_ttwu_pending" [arch/x86/kvm/kvm-intel.ko] undefined!
scripts/Makefile.modpost:92: recipe for target '__modpost' failed

Adding the following fixed it:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 46807dc..302b77d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -21,6 +21,7 @@
 #include <trace/events/sched.h>

 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+EXPORT_SYMBOL_GPL(sched_ttwu_pending);

 #if defined(CONFIG_SCHED_DEBUG) && defined(HAVE_JUMP_LABEL)
 /*