Re: CPU scheduler weirdness?

From: Marton Balint
Date: Tue Aug 18 2009 - 15:49:43 EST


Hi,

On Thu, 13 Aug 2009, Andreas Mohr wrote:
> On Thu, Aug 13, 2009 at 05:39:10PM +0200, Marton Balint wrote:
> > > > Does anybody have any idea what can cause this?
> > >
> > > /sys/devices/system/cpu/sched_smt_power_savings , perhaps?
> >
> > Thanks for the tip, tuning the sched_mc_power_savings setting helped! The
> > original value of it was 0, but after setting it to 1, the two
> > cpu-intensive processes got scheduled to different CPU cores, as
> > expected.
>
> Heh, I did expect it to not help, and indeed that thing helping in this
> way points to a... BUG, plain and simple.
>
> http://lwn.net/Articles/297306/
> lists possible settings as
>
> "
> The power savings and performance of the given workload in an under
> utilised system can be controlled by setting values of 0, 1 or 2 to
> /sys/devices/system/cpu/sched_mc_power_savings with 0 being highest
> performance and least power savings and level 2 indicating maximum
> power savings even at the cost of slight performance degradation.
> "
>
> which is exactly opposite to what I'd have expected to be normal,
> unconfigured behaviour in your case.
>
> > Setting it back to 0 caused the two cpu-intensive processes to run on
> > the same CPU again. So I guess I will just set it to 1 after booting the
> > system.
>
> ...which would indicate a level=1 or level=2 (maximum powersaving)
> behaviour. Something either seems reversed or really weird.
> But it could just be opaque, if correct, behaviour due to a much more
> complex load balancing algo in the scheduler or so.
>
> Comments, anyone?

In the meantime, I was able to create a tiny C program which always
successfully reproduces the bug. It's basically an endless loop which
does not stop while the process is running on the last CPU core. The
program creates multiple instances of itself, to be able to keep all of
the CPU cores busy. After 1 second, the processes running on any core
other than the last one die, while the processes running on the last
CPU core remain stuck there...

I tested it on my dual core system. If someone could test it on a quad
core and report back, that would probably be useful.

Usage: ./schedtest <number of CPU cores>
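
If passing the core count by hand is a nuisance, it could probably be
queried at run time instead. A minimal sketch, assuming glibc on Linux:
online_cores() is a hypothetical helper that is not part of the attached
program, and _SC_NPROCESSORS_ONLN is a glibc extension rather than
something POSIX guarantees.

```c
#include <unistd.h>

/* Hypothetical helper: number of CPUs currently online, or `fallback`
 * if sysconf() cannot report it. */
static int online_cores(int fallback)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : fallback;
}
```

With something like this, cores could default to online_cores(2) when no
argument is given, keeping the original default of 2 as the fallback.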

And don't forget to kill the stuck processes after using the program! :)
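
The manual cleanup could likely be avoided by giving the busy loop a
deadline. The following is only a sketch under that assumption, not what
the attached program does: spin_with_deadline() and now_ms() are
hypothetical names, and the loop uses a monotonic clock instead of
gettimeofday(). It returns 0 when all lives were used up away from
last_cpu, and 1 when the deadline expired first (i.e. the process
appears stuck on last_cpu).

```c
#define _GNU_SOURCE
#include <sched.h>   /* sched_getcpu() */
#include <time.h>    /* clock_gettime() */

/* Milliseconds on a monotonic clock. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Busy-loop like schedtest, but give up after max_ms milliseconds.
 * One life is lost per millisecond tick spent off last_cpu. */
static int spin_with_deadline(int last_cpu, int lives, long long max_ms)
{
    long long start = now_ms(), last = start;

    while (lives > 0) {
        long long t = now_ms();
        if (t - start >= max_ms)
            return 1;               /* deadline hit: still on last_cpu */
        if (t != last && sched_getcpu() != last_cpu)
            lives--;
        last = t;
    }
    return 0;                       /* lives used up off last_cpu */
}
```

A process that gets 1 back can report that it was stuck and then exit on
its own, so nothing is left running afterwards.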

Regards,
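Marton

#define _GNU_SOURCE
#include <sched.h>      /* sched_getcpu() */
#include <stdlib.h>     /* atoi() */
#include <sys/time.h>   /* gettimeofday() */
#include <unistd.h>     /* fork() */

/* Usage: ./schedtest <number of cpu cores> */

/* Millisecond part of the current time (0..999). */
int milliseconds(void) {
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_usec / 1000;
}

int main(int argc, char *argv[]) {
    int lives = 1000, time, lasttime = -1, childs, cores = 2;
    if (argc > 1)
        cores = atoi(argv[1]);
    childs = cores * 2;
    /* A parent leaves this loop after one fork; each child keeps
     * forking, giving a chain of cores * 2 + 1 processes. */
    while (childs-- && !fork());
    /* Lose one life per millisecond tick spent off the last core;
     * a process that stays on the last core never exits. */
    while (lives) {
        time = milliseconds();
        if (lasttime != time && sched_getcpu() != (cores - 1))
            lives--;
        lasttime = time;
    }
    return 0;
}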