Re: [PATCH 05/05] x86: Rename paravirtualized TSC functions

From: Yinghai Lu
Date: Thu Jul 10 2008 - 03:34:21 EST


On Thu, Jul 10, 2008 at 12:22 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Yinghai Lu <yhlu.kernel@xxxxxxxxx> wrote:
>
>> On Wed, Jul 9, 2008 at 10:30 AM, Alok Kataria <akataria@xxxxxxxxxx> wrote:
>> > On Wed, 2008-07-09 at 00:20 -0700, Yinghai Lu wrote:
>> >> On Tue, Jul 8, 2008 at 11:13 PM, Ingo Molnar <mingo@xxxxxxx> wrote:
>> >> >
>> >> > * Alok Kataria <akataria@xxxxxxxxxx> wrote:
>> >> >
>> >
>> >> got
>> >> calling ixgb_init_module+0x0/0x76
>> >> Intel(R) PRO/10GbE Network Driver - version 1.0.126-k4
>> >> Copyright (c) 1999-2006 Intel Corporation.
>> >> vendor=1022 device=7458
>> >> ACPI: PCI Interrupt 0000:0c:01.0[A] -> GSI 56 (level, low) -> IRQ 56
>> >> ixgb: eth18: ixgb_probe: Intel(R) PRO/10GbE Network Connection
>> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> >> IP: [<ffffffff80253b17>] hrtick_start_fair+0x89/0x173
>> >> PGD 0
>> >> Oops: 0000 [1] SMP
>> >> CPU 4
>> >> Modules linked in:
>> >> Pid: 103, comm: events/4 Not tainted 2.6.26-rc9-tip-00026-geae1aa0-dirty #240
>> >> RIP: 0010:[<ffffffff80253b17>] [<ffffffff80253b17>] hrtick_start_fair+0x89/0x173
>> >> RSP: 0018:ffff88082481fbd0 EFLAGS: 00010046
>> >> RAX: 0000000000000000 RBX: ffff880824828000 RCX: 0000000000000006
>> >> RDX: ffff88084423e700 RSI: ffff880824828000 RDI: ffff88084423e700
>> >> RBP: ffff88082481fc00 R08: 0000000000000000 R09: 0000000000000000
>> >> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88084423e700
>> >> R13: ffff880844239700 R14: ffff880824828000 R15: 0000000000000c1a
>> >> FS: 0000000000000000(0000) GS:ffff881024c39600(0000) knlGS:0000000000000000
>> >> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
>> >> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006e0
>> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >> Process events/4 (pid: 103, threadinfo ffff88082481e000, task ffff880824814230)
>> >> Stack: ffff880844248118 000000005c0ee9e3 ffff880844239700 ffff88082490d878
>> >> ffff88084423e700 0000000000000000 ffff88082481fc40 ffffffff80256591
>> >> ffff88082481fc30 ffffffff80516fc5 000000005c0ee9e3 ffff88082490d840
>> >> Call Trace:
>> >> [<ffffffff80256591>] dequeue_task_fair+0x5f/0x7e
>> >> [<ffffffff80516fc5>] ? __first_cpu+0x26/0x49
>> >> [<ffffffff80252792>] dequeue_task+0xce/0xf0
>> >> [<ffffffff80252835>] deactivate_task+0x31/0x50
>> >> [<ffffffff80252b93>] pull_task+0x2c/0x78
>> >> [<ffffffff80254c04>] load_balance_fair+0x18d/0x277
>> >> [<ffffffff80a49f02>] schedule+0x3db/0x962
>> >> [<ffffffff802c0f0f>] ? vmstat_update+0x0/0x5e
>> >> [<ffffffff802787c9>] ? schedule_delayed_work+0x31/0x48
>> >> [<ffffffff802779ee>] worker_thread+0xbb/0x114
>> >> [<ffffffff8027c5ec>] ? autoremove_wake_function+0x0/0x63
>> >> [<ffffffff80277933>] ? worker_thread+0x0/0x114
>> >> [<ffffffff8027c147>] kthread+0x61/0xa4
>> >> [<ffffffff802614e5>] ? schedule_tail+0x36/0x81
>> >> [<ffffffff8022b509>] child_rip+0xa/0x11
>> >> [<ffffffff8027c0e6>] ? kthread+0x0/0xa4
>> >> [<ffffffff8022b4ff>] ? child_rip+0x0/0x11
>> >>
>> >>
>> >> Code: c3 80 e8 86 04 01 00 80 3d 76 52 c0 00 00 0f 89 e1 00 00 00 41
>> >> f6 84 24 70 08 00 00 04 0f 85 d2 00 00 00 49 8b 84 24 a8 08 00 00 <48>
>> >> 8b 00 83 b8 c0 00 00 00 00 0f 84 ba 00 00 00 49 83 7d 10 01
>> >> RIP [<ffffffff80253b17>] hrtick_start_fair+0x89/0x173
>> >> RSP <ffff88082481fbd0>
>> >> CR2: 0000000000000000
>> >> ---[ end trace c05d5c1f5b126388 ]---
>> >>
>> >> yesterday tip/master with tip/x86/modules
>> >> tip-history-2008-07-08_16.08_Tue works well.
>> >>
>> >> the other traps merge does not seem to cause the problem.
>> >>
>> >
>> > Hi Yinghai,
>> >
>> > Are we sure that these patches cause this null pointer dereference ?
>> > The panic in scheduler seems to be totally disconnected to the changes
>> > that these patches make. The only scheduler bit that we touch is the
>> > sched_clock thingy....but that too looks harmless to me.
>> >
>> > Can you please bisect and see if the first patch in the series is the
>> > problem ?
>>
>> I tried last night; it seems the pgtable-related patches cause that.
>
> that would be the tip/xen64 stuff, right? Does this revert:
>
> | Revert "x86_64: there's no need to preallocate level1_fixmap_pgt"
> |
> | This reverts commit 033786969d1d1b5af12a32a19d3a760314d05329.
> |
> | Suresh Siddha reported that this broke booting on his 2GB testbox.
>
> solve your problems, or are there other problems still?

I am still bisecting it now...

YH