Re: CPU scheduler weirdness?

From: Marton Balint
Date: Sat Aug 29 2009 - 10:16:10 EST




On Thu, 20 Aug 2009, Marton Balint wrote:


On Thu, 20 Aug 2009, Ingo Molnar wrote:


* Marton Balint <cus@xxxxxxxxxx> wrote:


On Wed, 19 Aug 2009, Peter Zijlstra wrote:

On Wed, 2009-08-19 at 14:34 +0200, Marton Balint wrote:

On Wed, 19 Aug 2009, Peter Zijlstra wrote:

On Wed, 2009-08-19 at 14:01 +0200, Marton Balint wrote:
On Wed, 19 Aug 2009, Peter Zijlstra wrote:
On Tue, 2009-08-18 at 21:49 +0200, Marton Balint wrote:

In the meantime, I was able to create a tiny C program which reliably
reproduces the bug. It's basically an endless loop which does
not stop while the process is running on the last CPU core. The program
creates multiple instances of itself, to be able to keep all of the CPU
cores busy. After 1 second, the processes running on any core other than
the last one exit, while the processes running on the last CPU core remain
stuck there...

I tested it on my dual-core system. If someone could test it on a quad-core
system and report back, that would probably be useful.

Usage: ./schedtest <number of CPU cores>

And don't forget to kill the stuck processes after using the program! :)

So what's the bug? Sure one task will stay on the cpu, and because there
is no contention it doesn't get migrated, and therefore won't quit,
how's that a problem?

The problem is that more than one process remains on that CPU core, and none
of them gets migrated to the other (idle) cores. I tested it with my E8400
processor and 2.6.31-rc5-git3 kernel.

Only one remains here.. on a c2q running 2.6.31-rc6-tip

Do you have a .config handy?


Yes it's in my original post:

http://marc.info/?l=linux-kernel&m=125012584709800&w=2

Right you are... so I built a kernel with the cgroup scheduler enabled and
tested it on a dual-core opteron machine, but I can't seem to reproduce
this.

Are you using cgroups in any way, or do you simply have it enabled in
your config?

No, it's just enabled. Actually the kernel is from the
openSUSE build service:

http://download.opensuse.org/repositories/Kernel:/HEAD/openSUSE_11.1/x86_64/

But the problem is present for both the kernel-default
kernel and the kernel-vanilla kernel which does not
contain any suse-specific patches.

This evening I had a bit more time to test, and I've
made a surprising discovery: I can only reproduce the
bug if the kernel module of my TV tuner card is loaded.
I have a Leadtek Winfast 2000 XP Expert TV card, it
uses the cx8800 kernel module. It seems that the
problem is somehow related to the infrared sensor of
the TV card, because I recompiled the module with the
'case CX88_BOARD_WINFAST2000XP_EXPERT:' line removed
from cx88-input.c and I couldn't reproduce the bug with
the new kernel module.

Extremely weird. Are timers somehow busted?

How can I check that?

In the meantime, I updated my original C program and also created a kernel module (schedtest_mod.c) which causes the same scheduling problems as the kernel module of my TV card. The kernel module is a skeleton of the infrared sensor polling code in cx88-input.c. It uses schedule_delayed_work, and this seems to be what causes the problem. The C program (schedtest.c) now detects the number of CPU cores by itself. Its command line parameter is now the number of the CPU core on which the schedtest processes will not quit (previously this was always the last core).

So to reproduce the bug on a dual-core system, compile and insert the kernel module (schedtest_mod.c). Then check dmesg; it should show which CPU core the delayed_work is running on. You should use the CPU core id of the _other_ CPU core as the command line parameter to the updated schedtest program.

And by the way, thank you guys for the help so far, hopefully we'll get to the bottom of this :)

I reproduced the bug with the kernel module and C program provided earlier on a different computer (a laptop with a Core 2 Duo P8400 CPU), and also bisected the bug to this commit:

sched: fine-tune SD_MC_INIT:
14800984706bf6936bbec5187f736e928be5c218

If I add the removed SD_BALANCE_NEWIDLE back to the flags, everything works as expected. So what would be the correct fix for this bug? Revert the patch, or just add SD_BALANCE_NEWIDLE back to the flags?

Regards,
Marton