Re: [patch] sched-HT-2.6.0-test11-A5

From: Anton Blanchard
Date: Sun Dec 07 2003 - 11:44:32 EST



Hi,

> i've seen a similar crash once on a 2-way (4-way) HT box, so there some
> startup race going on most likely.

I'm seeing bootup crashes every now and then on a ppc64 box too. A few
other things I've noticed:

- nr_running looks to be wrong. On an idle machine just after booting:

00:07:20 up 14 min, 3 users, load average: 8.00, 7.67, 4.95

It's a 4 core, 8 thread machine, so perhaps we are counting idle threads.
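
As a rough sanity check on that guess, here is a minimal sketch of the
fixed-point load average update (the CALC_LOAD recurrence, sampled every
5 seconds). The helper simulate_load() and its parameters are made up for
illustration, and the assumption is that exactly 8 tasks are counted as
active at every sample from boot onwards:

```c
/* Fixed-point constants as used by the kernel's load average code. */
#define FSHIFT   11                  /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)     /* 1.0 in fixed point */
#define EXP_1    1884UL              /* 1/exp(5sec/1min), fixed point */
#define EXP_5    2014UL              /* 1/exp(5sec/5min) */
#define EXP_15   2037UL              /* 1/exp(5sec/15min) */

/* One update step, same arithmetic as the CALC_LOAD macro. */
static unsigned long calc_load(unsigned long load, unsigned long exp,
                               unsigned long active)
{
	load *= exp;
	load += active * (FIXED_1 - exp);
	return load >> FSHIFT;
}

/*
 * Hypothetical helper: start from a load of 0 and feed in a constant
 * nr_counted "active" tasks for the given number of 5 second samples.
 * Returns the resulting load average as a floating point value.
 */
static double simulate_load(unsigned long nr_counted, unsigned long samples,
                            unsigned long exp)
{
	unsigned long load = 0;
	unsigned long active = nr_counted * FIXED_1;
	unsigned long i;

	for (i = 0; i < samples; i++)
		load = calc_load(load, exp, active);

	return (double)load / FIXED_1;
}
```

Calling simulate_load(8, 14 * 60 / 5, EXP_1), and likewise with EXP_5 and
EXP_15, gives a 1 minute figure just under 8, a 5 minute figure around 7.5
and a 15 minute figure a little under 5. That is in the same ballpark as the
uptime output above, so a constant count of 8 (one per thread) since boot
would be consistent with what we see, though it does not prove it.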

- The printk had me confused: we are really mapping cpu2 onto cpu1's runqueue.
Patch below.

- I tried the HT scheduler with NUMA enabled. Same machine, 4 core 8
threads, each NUMA node has 2 cores, 4 threads. It's easy to end up in a
suboptimal state:

Cpu0 : 0.0% user, 0.0% system, 0.0% nice, 100.0% idle, 0.0% IO-wait
Cpu1 : 0.0% user, 0.0% system, 0.0% nice, 100.0% idle, 0.0% IO-wait
Cpu2 : 100.0% user, 0.0% system, 0.0% nice, 0.0% idle, 0.0% IO-wait
Cpu3 : 100.0% user, 0.0% system, 0.0% nice, 0.0% idle, 0.0% IO-wait

Cpu4 : 100.0% user, 0.0% system, 0.0% nice, 0.0% idle, 0.0% IO-wait
Cpu5 : 0.0% user, 0.0% system, 0.0% nice, 100.0% idle, 0.0% IO-wait
Cpu6 : 100.0% user, 0.0% system, 0.0% nice, 0.0% idle, 0.0% IO-wait
Cpu7 : 0.7% user, 0.7% system, 0.0% nice, 98.6% idle, 0.0% IO-wait

cpu0/1 are an SMT pair, cpu 0-3 are a NUMA node. As you can see, cpu0/1
is free and cpu2/3 is busy on both threads. So far we have noticed that
nr_cpus_node should probably be nr_runqueues_node now, otherwise the
inter node balancing code could make bad decisions. However, in this case
the imbalance is within the node, so I'm not sure why the cpu0/1 runqueue
hasn't stolen a task from cpu2/3.
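
To make the "suboptimal" condition concrete, here is a small userspace
sketch (not scheduler code) that flags a node in which one core has every
thread busy while another core is completely idle. The function name, the
busy[] array and the assumption that SMT siblings have adjacent CPU numbers
(0/1, 2/3, ...) are all made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: two SMT siblings per core, numbered adjacently. */
#define THREADS_PER_CORE 2

/*
 * Hypothetical check: within the CPUs [first, first + ncpus), return true
 * if some core has all of its threads busy while another core in the same
 * range has all of its threads idle.
 */
static bool node_has_smt_imbalance(const bool *busy, size_t first,
                                   size_t ncpus)
{
	bool saw_full = false;   /* a core with every thread busy */
	bool saw_idle = false;   /* a core with every thread idle */
	size_t c, t;

	for (c = first; c + THREADS_PER_CORE <= first + ncpus;
	     c += THREADS_PER_CORE) {
		size_t nbusy = 0;

		for (t = 0; t < THREADS_PER_CORE; t++)
			nbusy += busy[c + t] ? 1 : 0;

		if (nbusy == THREADS_PER_CORE)
			saw_full = true;
		if (nbusy == 0)
			saw_idle = true;
	}

	return saw_full && saw_idle;
}
```

With the top output above encoded as busy[] = {0, 0, 1, 1, 1, 0, 1, 0},
the first node (cpu 0-3) is flagged, while the second node (cpu 4-7) is
not, since each of its cores runs exactly one task.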

Anton

--- foo/kernel/sched.c.ff 2003-12-03 02:03:41.000000000 -0600
+++ foo/kernel/sched.c 2003-12-04 11:37:40.980022085 -0600
@@ -1452,7 +1452,7 @@
runqueue_t *rq2 = cpu_rq(cpu2);
int cpu2_idx_orig = cpu_idx(cpu2), cpu2_idx;

- printk("mapping CPU#%d's runqueue to CPU#%d's runqueue.\n", cpu1, cpu2);
+ printk("mapping CPU#%d's runqueue to CPU#%d's runqueue.\n", cpu2, cpu1);
BUG_ON(rq1 == rq2 || rq2->nr_running || rq_idx(cpu1) != cpu1);
/*
* At this point, we dont have anything in the runqueue yet. So,