Re: [patch] sched: auto-tune migration costs [was: Re: Industry dbbenchmark result on recent 2.6 kernels]

From: Paul Jackson
Date: Sun Apr 03 2005 - 00:56:41 EST


Just so as no else wastes time repeating the little bit I've done so
far, and so I don't waste time figuring out what is already known,
here's what I have so far, trying out Ingo's "sched: auto-tune migration
costs" on ia64 SN2:

To get it to compile against 2.6.12-rc1-mm4, I did thus:

1. Manually edited "include/asm-x86_64/topology.h" to
remove .cache_hot_time (patch failed due to conflicts
with nearby changes to add some *_idx terms).
2. Moved the 394 line block of new code in kernel/sched.c
to _before_ the large #ifdef ARCH_HAS_SCHED_DOMAIN,
#else, #endif block. The ia64 arch (only) defines
ARCH_HAS_SCHED_DOMAIN, so was being denied use of Ingo's
code when it was buried in the '#else-#endif' side of
this large conditional block.
3. Add "#include <linux/vmalloc.h>" to kernel/sched.c
4. Don't print cpu_khz in the cost matrix header, as cpu_khz
is only in a few arch's (x86_64, ppc, i386, arm).

Note that (2) was just a superficial fix - it compiles, but the result
could easily be insanely stupid and I'd have no clue. I need to
read the code some more.

Booting on an 8 CPU ia64 SN2, the console output got far enough to show:

============================ begin ============================
Brought up 8 CPUs
softlockup thread 7 started up.
Total of 8 processors activated (15548.60 BogoMIPS).
---------------------
migration cost matrix (max_cache_size: 33554432):
---------------------
[00] [01] [02] [03] [04] [05] [06] [07]
[00]: -
============================= end =============================

Then it hung for 5 or 10 minutes, and then it blurted out a panic and
died. I'll quote the whole panic, including backtrace, in case someone
happens to see something obvious.

But I'm not asking anyone to think about this yet, unless it amuses
them. I can usefully occupy myself reading the code and adding printk's
for a while.

Note the first 3 chars of the panic message "4.5". This looks like it
might be the [00]-[01] entry of Ingo's table, flushed out when the
newlines of the panic came through.

============================ begin ============================
4.5(0)<1>Unable to handle kernel paging request at virtual address 0000000000010008
swapper[1]: Oops 8813272891392 [1]
Modules linked in:

Pid: 1, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 8000000000000288 ip : [<a0000001000d9a30>] Not tainted
ip is at queue_work+0xb0/0x1a0
unat: 0000000000000000 pfs : 0000000000000288 rsc : 0000000000000003
rnat: a000000100ab2a50 bsps: 0000000000100000 pr : 5a66666956996a65
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001000d99b0 b6 : a000000100003320 b7 : a000000100490200
f6 : 1003e0000000000009ff7 f7 : 1003e000418d3645db265
f8 : 1003e000000003b8186ed f9 : 1003e0000000000005f3b
f10 : 1003e0000000000001000 f11 : 1003e0000000000000040
r1 : a000000100c9de60 r2 : 0000000000000000 r3 : 0000000000000001
r8 : 0000000000000000 r9 : 0000000000000000 r10 : a000000100969c50
r11 : 0000000000000004 r12 : e00001b03a8d7910 r13 : e00001b03a8d0000
r14 : 0000000000000000 r15 : 0000000000010008 r16 : e00001b03a8d0dc0
r17 : 0000000000010008 r18 : 0000000000000103 r19 : a000000100c32048
r20 : a000000100c32018 r21 : a000000100aa92c8 r22 : e000003003005d90
r23 : e000003003005da8 r24 : a000000100cf2098 r25 : e000003003005db0
r26 : a000000100ab4bf4 r27 : e000003003005d81 r28 : 000000010004b001
r29 : 0000000000000000 r30 : 000000010004b000 r31 : a000000100c32010

Call Trace:
[<a000000100010460>] show_stack+0x80/0xa0
sp=e00001b03a8d74d0 bsp=e00001b03a8d1620
[<a000000100010d40>] show_regs+0x860/0x880
sp=e00001b03a8d76a0 bsp=e00001b03a8d15b8
[<a000000100036390>] die+0x170/0x200
sp=e00001b03a8d76b0 bsp=e00001b03a8d1580
[<a00000010005bb20>] ia64_do_page_fault+0x200/0xa40
sp=e00001b03a8d76b0 bsp=e00001b03a8d1520
[<a00000010000b2c0>] ia64_leave_kernel+0x0/0x290
sp=e00001b03a8d7740 bsp=e00001b03a8d1520
[<a0000001000d9a30>] queue_work+0xb0/0x1a0
sp=e00001b03a8d7910 bsp=e00001b03a8d14e0
[<a0000001000db0d0>] schedule_work+0x30/0x60
sp=e00001b03a8d7910 bsp=e00001b03a8d14c8
[<a000000100490230>] blank_screen_t+0x30/0x60
sp=e00001b03a8d7910 bsp=e00001b03a8d14b8
[<a0000001000c8130>] run_timer_softirq+0x2d0/0x4a0
sp=e00001b03a8d7910 bsp=e00001b03a8d1410
[<a0000001000bb920>] __do_softirq+0x220/0x260
sp=e00001b03a8d7930 bsp=e00001b03a8d1378
[<a0000001000bb9e0>] do_softirq+0x80/0xe0
sp=e00001b03a8d7930 bsp=e00001b03a8d1320
[<a0000001000bbc50>] irq_exit+0x90/0xc0
sp=e00001b03a8d7930 bsp=e00001b03a8d1310
[<a00000010000f4b0>] ia64_handle_irq+0x110/0x140
sp=e00001b03a8d7930 bsp=e00001b03a8d12d8
[<a00000010000b2c0>] ia64_leave_kernel+0x0/0x290
sp=e00001b03a8d7930 bsp=e00001b03a8d12d8
[<a000000100844b20>] read_cache+0x40/0x60
sp=e00001b03a8d7b00 bsp=e00001b03a8d12c8
[<a000000100844fb0>] target_handler+0xd0/0xe0
sp=e00001b03a8d7b00 bsp=e00001b03a8d1298
[<a000000100845150>] measure_one+0x190/0x240
sp=e00001b03a8d7b00 bsp=e00001b03a8d1260
[<a000000100845890>] measure_cacheflush_time+0x270/0x420
sp=e00001b03a8d7b30 bsp=e00001b03a8d1200
[<a0000001000a7350>] calibrate_cache_decay+0x710/0x740
sp=e00001b03a8d7b40 bsp=e00001b03a8d1148
[<a000000100056180>] arch_init_sched_domains+0x12c0/0x1e40
sp=e00001b03a8d7b60 bsp=e00001b03a8d0e80
[<a000000100845a60>] sched_init_smp+0x20/0x60
sp=e00001b03a8d7de0 bsp=e00001b03a8d0e70
[<a000000100009570>] init+0x250/0x440
sp=e00001b03a8d7de0 bsp=e00001b03a8d0e38
[<a000000100012940>] kernel_thread_helper+0xe0/0x100
sp=e00001b03a8d7e30 bsp=e00001b03a8d0e10
[<a000000100009120>] start_kernel_thread+0x20/0x40
sp=e00001b03a8d7e30 bsp=e00001b03a8d0e10
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
============================= end =============================


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxxxxxxx> 1.650.933.1373, 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/