Re: 2.6.12-mm1 boot failure on NUMA box.

From: Martin J. Bligh
Date: Tue Jun 21 2005 - 09:32:49 EST


>> OK, after fixing the build failure with Andy's patch here:
>>
>> http://mbligh.org/abat/apw_pci_assign_unassigned_resources
>
> yup, I have that now.

Sweet, thanks.

>> I get a boot failure on the NUMA-Q box. Full log is here:
>>
>> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/6184/debug/console.log
>>
>> But at the end it prints out lots of wierd scheduler stuff, then one more
>> message, then dies:
>>
>> | migration cost matrix (max_cache_size: 2097152, cpu: 700 MHz):
>> ---------------------
>
> That's Ingo debug stuff.
>
>> --------------------------------
>> NET: Registered protocol family 16
>
> Well it got up to core_initcall(netlink_proto_init);
>
>> I guess I'll try backing out the scheduler patches unless someone else
>> has a brighter idea?
>
> It doesn't look like a scheduler thing. Tried enabling initcall_debug?

Humpf. I agree it seemed to get a bit further than that, but I kicked off
a new test before I went to bed, and it does seem to work w/o the sched
patches:

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/regression_matrix.html

see the "moe" column, comparing:

2.6.12-mm1
+apw_pci_assign_unass
+nosched_2.6.12-mm1

vs

2.6.12-mm1
+apw_pci_assign_unass

rows ?

I can still do initcall debug if you want. Or I guess it's binary chop
search amongst sched patches (or at least the ones that are new in
this -mm ?)

M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/