Re: -next March 3: Boot failure on x86 (Oops)

From: Tejun Heo
Date: Fri Mar 05 2010 - 01:08:21 EST


Hello,

On 03/04/2010 02:23 PM, Sachin Sant wrote:
>> Can you please feed the address to gdb and get the line number? Also,
>> is it reproducible on mainline?
>>
> I can recreate this with latest git as well (2.6.33-git9 [eaa5eec7..])
>
> Disassembly from 2.6.33-git9 code base follows :
>
> /usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1137
> if (off >= 0)
> e91: 0f 89 fd 00 00 00 jns f94 <pcpu_alloc+0x2bd>
> /usr/local/autobench/var/tmp/build/linux/mm/percpu.c:1116
> }
>
> restart:
> /* search through normal chunks */
> for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) {
> list_for_each_entry(chunk, &pcpu_slot[slot], list) {
> e97: 8b 45 84 mov -0x7c(%ebp),%eax
> e9a: 8b 00 mov (%eax),%eax
> e9c: 89 45 84 mov %eax,-0x7c(%ebp)
> prefetch():
> /usr/local/autobench/var/tmp/build/linux/arch/x86/include/asm/processor.h:886
>
> e9f: 8b 55 84 mov -0x7c(%ebp),%edx
> ea2: 8b 02 mov (%edx),%eax
>
> ^^^^^^^^^^^^^^^^^^^ EIP corresponds to this line

Hmmm... this means that on one of the chunks, chunk->list.next was
NULL (BTW, the disassembly is from an unlinked object file, right?).
The main allocation code hasn't changed much lately. The only changes
are:

22b737f4c75197372d64afc6ed1bccd58c00e549 : just refactoring
833af8427be4b217b5bc522f61afdbd3f1d282c2 : possible but isn't very new
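
To make the "chunk->list.next was NULL" reading concrete, here is a
minimal user-space sketch of a sanity walk over a circular doubly
linked list. It is only an illustration, not kernel code: struct
demo_list and demo_list_check() are hypothetical names standing in for
struct list_head and a debug helper, and the max_nodes bound is an
assumption added so the walk always terminates.

```c
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct list_head. */
struct demo_list {
	struct demo_list *next, *prev;
};

/*
 * Walk a circular list starting at head.
 *
 * Returns 0 if every link is intact, the 1-based position of the first
 * node whose next pointer is NULL or whose successor does not point
 * back to it (the head counts as position 1), or -1 if the walk did
 * not wrap around to head within max_nodes steps.
 */
static int demo_list_check(const struct demo_list *head, int max_nodes)
{
	const struct demo_list *pos = head;
	int i;

	for (i = 1; i <= max_nodes; i++) {
		/* A NULL next is what a plain list walk would dereference. */
		if (!pos->next || pos->next->prev != pos)
			return i;
		pos = pos->next;
		if (pos == head)
			return 0;
	}
	return -1;
}
```

A check like this, run over each slot list before the allocation loop,
would report which node carries the broken link instead of faulting on
the following load.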

Another possibility is that the data structure located before it in
memory was overrun and corrupted the list part. pcpu_chunk is
allocated with a variable-size array attached at the end, so maybe I
screwed up the size calculation somewhere? This could explain the
difference between 64-bit and 32-bit. If you add padding at the head
of struct pcpu_chunk, say, unsigned long pad[16], does the problem go
away?
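
The padding experiment can be sketched in user space as follows. This
is a rough illustration under stated assumptions, not the real
definition: struct demo_chunk, its fields, and demo_chunk_size() are
hypothetical, and the sketch assumes the list links sit at the start of
the structure with the variable-size array at the end.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct demo_list {
	struct demo_list *next, *prev;
};

/* Hypothetical chunk: list links first, variable-size array last. */
struct demo_chunk {
	struct demo_list list;
	int nr_entries;
	unsigned long entries[];	/* sized at allocation time */
};

/* Same layout with the suggested debug padding at the head. */
struct demo_chunk_padded {
	unsigned long pad[16];		/* absorbs an overrun from the object before */
	struct demo_list list;
	int nr_entries;
	unsigned long entries[];
};

/* Bytes to allocate for a chunk holding nr entries. */
static size_t demo_chunk_size(size_t nr)
{
	return sizeof(struct demo_chunk) + nr * sizeof(unsigned long);
}
```

With the padding in place, a write that runs past the end of the
preceding object lands in pad[] rather than in list. If the oops then
disappears, that would point toward an overrun rather than a bug in
the list handling itself.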

Thanks.

--
tejun