Re: linux-next: Tree for Sept 26 (not bootable on AMD64:thermal|acpi|drm/i915|pci related?)

From: Zhang Rui
Date: Thu Sep 27 2012 - 02:07:33 EST


Hi, Hugh,

On Wed, 2012-09-26 at 12:51 -0700, Hugh Dickins wrote:
> On Wed, 26 Sep 2012, Sedat Dilek wrote:
> >
> > on my Ubuntu/precise AMD64 today's Linux-Next runs into the following
> > call-trace (machine freezes):
> >
> > Sep 26 19:22:58 fambox kernel: [ 11.124739] BUG: unable to handle
> > kernel NULL pointer dereference at 0000000000000018
> > Sep 26 19:22:58 fambox kernel: [ 11.124806] IP: [<ffffffff814bb058>]
> > thermal_cooling_device_register+0x2c8/0x3d0
> > Sep 26 19:22:58 fambox kernel: [ 11.124869] PGD 0
> > Sep 26 19:22:58 fambox kernel: [ 11.124895] Oops: 0000 [#1] SMP
> > Sep 26 19:22:58 fambox kernel: [ 11.124919] Modules linked in:
> > coretemp kvm_intel kvm snd_hda_intel(+) arc4 snd_hda_codec iwldvm
> > snd_hwdep ghash_clmulni_intel snd_pcm aesni_intel uvcvideo mac80211
> > aes_x86_64 snd_page_alloc ablk_helper i915(+) snd_seq_midi cryptd
> > videobuf2_vmalloc snd_seq_midi_event xts videobuf2_memops lrw
> > snd_rawmidi videobuf2_core joydev gf128mul videodev snd_seq
> > snd_seq_device hid_generic snd_timer drm_kms_helper iwlwifi drm snd
> > psmouse i2c_algo_bit soundcore btusb microcode serio_raw
> > samsung_laptop wmi cfg80211 bluetooth mei mac_hid video lpc_ich lp
> > parport ext4 jbd2 usbhid hid r8169
> > Sep 26 19:22:58 fambox kernel: [ 11.125319] CPU 2
> > Sep 26 19:22:58 fambox kernel: [ 11.125332] Pid: 579, comm: modprobe
> > Not tainted 3.6.0-rc7-next20120926-2-iniza-generic #1 SAMSUNG
> > ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH
> > Sep 26 19:22:58 fambox kernel: [ 11.125401] RIP:
> > 0010:[<ffffffff814bb058>] [<ffffffff814bb058>]
> > thermal_cooling_device_register+0x2c8/0x3d0
> > Sep 26 19:22:58 fambox kernel: [ 11.125450] RSP:
> > 0018:ffff88010f23d838 EFLAGS: 00010246
> > Sep 26 19:22:58 fambox kernel: [ 11.125475] RAX: 0000000000000000
> > RBX: ffff88010bfd4c00 RCX: 0000000000000001
> > Sep 26 19:22:58 fambox kernel: [ 11.125507] RDX: 0000000000000000
> > RSI: 0000000000000282 RDI: 0000000000000282
> > Sep 26 19:22:58 fambox kernel: [ 11.125539] RBP: ffff88010f23d878
> > R08: 0000000000000000 R09: 0000000000000001
> > Sep 26 19:22:58 fambox kernel: [ 11.125570] R10: ffff8801054fc460
> > R11: 0000000000000000 R12: ffff88010bfd4c04
> > Sep 26 19:22:58 fambox kernel: [ 11.125602] R13: ffff88011a73e000
> > R14: 0000000000000000 R15: 0000000000000000
> > Sep 26 19:22:58 fambox kernel: [ 11.125634] FS:
> > 00007f6b8da29700(0000) GS:ffff88011fa80000(0000)
> > knlGS:0000000000000000
> > Sep 26 19:22:58 fambox kernel: [ 11.125670] CS: 0010 DS: 0000 ES:
> > 0000 CR0: 0000000080050033
> > Sep 26 19:22:58 fambox kernel: [ 11.125697] CR2: 0000000000000018
> > CR3: 0000000110185000 CR4: 00000000000407e0
> > Sep 26 19:22:58 fambox kernel: [ 11.125729] DR0: 0000000000000000
> > DR1: 0000000000000000 DR2: 0000000000000000
> > Sep 26 19:22:58 fambox kernel: [ 11.125761] DR3: 0000000000000000
> > DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Sep 26 19:22:58 fambox kernel: [ 11.125793] Process modprobe (pid:
> > 579, threadinfo ffff88010f23c000, task ffff880116648000)
> > Sep 26 19:22:58 fambox kernel: [ 11.127133] Stack:
> > Sep 26 19:22:58 fambox kernel: [ 11.128461] ffff88011a5d9098
> > ffff88010d74c300 ffff88010f23d878 ffff880110b06480
> > Sep 26 19:22:58 fambox kernel: [ 11.129755] ffff88010bfd1000
> > ffff88011a5d9098 ffff88010d74c300 ffff88011a625000
> > Sep 26 19:22:58 fambox kernel: [ 11.130943] ffff88010f23d958
> > ffffffffa00e5868 0000000000000000 ffffffff811fa033
> > Sep 26 19:22:58 fambox kernel: [ 11.132749] Call Trace:
> > Sep 26 19:22:58 fambox kernel: [ 11.133939] [<ffffffffa00e5868>]
> > acpi_video_bus_add+0x9ba/0xce6 [video]
> > Sep 26 19:22:58 fambox kernel: [ 11.135130] [<ffffffff811fa033>] ?
> > sysfs_addrm_finish+0x33/0xc0
> > Sep 26 19:22:58 fambox kernel: [ 11.136313] [<ffffffff813454cc>]
> > acpi_device_probe+0x4e/0x11c
> > Sep 26 19:22:58 fambox kernel: [ 11.137482] [<ffffffff813d472b>]
> > driver_probe_device+0x7b/0x240
> > Sep 26 19:22:58 fambox kernel: [ 11.138642] [<ffffffff813d499b>]
> > __driver_attach+0xab/0xb0
> > Sep 26 19:22:58 fambox kernel: [ 11.139798] [<ffffffff813d48f0>] ?
> > driver_probe_device+0x240/0x240
> > Sep 26 19:22:58 fambox kernel: [ 11.140981] [<ffffffff813d2b46>]
> > bus_for_each_dev+0x56/0x90
> > Sep 26 19:22:58 fambox kernel: [ 11.142129] [<ffffffff813d425e>]
> > driver_attach+0x1e/0x20
> > Sep 26 19:22:58 fambox kernel: [ 11.143264] [<ffffffff813d3dd0>]
> > bus_add_driver+0x190/0x290
> > Sep 26 19:22:58 fambox kernel: [ 11.144443] [<ffffffff813d4efa>]
> > driver_register+0x7a/0x160
> > Sep 26 19:22:58 fambox kernel: [ 11.145583] [<ffffffff81345ccf>]
> > acpi_bus_register_driver+0x43/0x45
> > Sep 26 19:22:58 fambox kernel: [ 11.146871] [<ffffffffa00e4dac>]
> > acpi_video_register+0x20/0x39 [video]
> > Sep 26 19:22:58 fambox kernel: [ 11.148167] [<ffffffffa02f4bad>]
> > i915_driver_load+0x83d/0xea0 [i915]
> > Sep 26 19:22:58 fambox kernel: [ 11.149451] [<ffffffffa020ebc1>]
> > drm_get_pci_dev+0x191/0x2b0 [drm]
> > Sep 26 19:22:58 fambox kernel: [ 11.150739] [<ffffffffa0345e2b>]
> > i915_pci_probe+0x4f/0x57 [i915]
> > Sep 26 19:22:58 fambox kernel: [ 11.152015] [<ffffffff81309af9>]
> > local_pci_probe+0x79/0x100
> > Sep 26 19:22:58 fambox kernel: [ 11.153287] [<ffffffff8130b1f9>]
> > pci_device_probe+0x109/0x130
> > Sep 26 19:22:58 fambox kernel: [ 11.154546] [<ffffffff813d472b>]
> > driver_probe_device+0x7b/0x240
> > Sep 26 19:22:58 fambox kernel: [ 11.155796] [<ffffffff813d499b>]
> > __driver_attach+0xab/0xb0
> > Sep 26 19:22:58 fambox kernel: [ 11.157048] [<ffffffff813d48f0>] ?
> > driver_probe_device+0x240/0x240
> > Sep 26 19:22:58 fambox kernel: [ 11.158280] [<ffffffff813d2b46>]
> > bus_for_each_dev+0x56/0x90
> > Sep 26 19:22:58 fambox kernel: [ 11.159496] [<ffffffff813d425e>]
> > driver_attach+0x1e/0x20
> > Sep 26 19:22:58 fambox kernel: [ 11.160708] [<ffffffff813d3dd0>]
> > bus_add_driver+0x190/0x290
> > Sep 26 19:22:58 fambox kernel: [ 11.161909] [<ffffffff813d4efa>]
> > driver_register+0x7a/0x160
> > Sep 26 19:22:58 fambox kernel: [ 11.163121] [<ffffffff8130a159>]
> > __pci_register_driver+0x49/0x50
> > Sep 26 19:22:58 fambox kernel: [ 11.164227] [<ffffffffa020edfa>]
> > drm_pci_init+0x11a/0x130 [drm]
> > Sep 26 19:22:58 fambox kernel: [ 11.165295] [<ffffffffa037c000>] ?
> > 0xffffffffa037bfff
> > Sep 26 19:22:58 fambox kernel: [ 11.166351] [<ffffffffa037c066>]
> > i915_init+0x66/0x68 [i915]
> > Sep 26 19:22:58 fambox kernel: [ 11.167360] [<ffffffff8100203f>]
> > do_one_initcall+0x3f/0x170
> > Sep 26 19:22:58 fambox kernel: [ 11.168332] [<ffffffff810bc9be>]
> > sys_init_module+0xbe/0x220
> > Sep 26 19:22:58 fambox kernel: [ 11.169276] [<ffffffff8164a0bd>]
> > system_call_fastpath+0x1a/0x1f
> > Sep 26 19:22:58 fambox kernel: [ 11.170200] Code: 1b 7e 00 48 3d c0
> > cb c9 81 4c 8d a8 b0 fc ff ff 0f 84 bf 00 00 00 0f 1f 44 00 00 4d 8b
> > bd f0 02 00 00 4d 85 ff 0f 84 c0 00 00 00 <49> 8b 77 18 48 85 f6 0f 84
> > d3 fe ff ff 41 8b 47 14 85 c0 7e 7b
> > Sep 26 19:22:58 fambox kernel: [ 11.172467] RIP
> > [<ffffffff814bb058>] thermal_cooling_device_register+0x2c8/0x3d0
> > Sep 26 19:22:58 fambox kernel: [ 11.173542] RSP <ffff88010f23d838>
> > Sep 26 19:22:58 fambox kernel: [ 11.174603] CR2: 0000000000000018
> > Sep 26 19:22:58 fambox kernel: [ 11.175675] ---[ end trace
> > 0f2bc437662fc097 ]---
> >
> > I have CCed some maintainers, hope they might have a look at it
> > (sorry, didn't check any MLs).
> >
> > My kern.log and kernel-config are attached.
>
> I get oops in bind_cdev() called from thermal_cooling_device_register()
> on mmotm, I expect yours is the same (but bind_cdev inlined so it doesn't
> appear in the backtrace). This patch gets me booting (note it also fixes
> return to break to unlock the mutex), does it help you? Whether the root
> problem is elsewhere or not, I've no idea; but I see Dan already spotted
> other errors hereabouts - probably needs more thorough review.
>


> Hugh
>
> --- mmotm/drivers/thermal/thermal_sys.c	2012-09-26 10:15:28.652071362 -0700
> +++ linux/drivers/thermal/thermal_sys.c	2012-09-26 11:41:01.549684983 -0700
> @@ -252,8 +252,8 @@ static void bind_cdev(struct thermal_coo
>  		}
>  
>  		tzp = pos->tzp;
> -		if (!tzp->tbp)
> -			return;
> +		if (!tzp || !tzp->tbp)
> +			break;
>  
>  		for (i = 0; i < tzp->num_tbps; i++) {
>  			if (tzp->tbp[i].cdev || !tzp->tbp[i].match)

Thanks for the fix; this patch looks good to me.
Could you please resend it in the proper format, i.e. with a changelog and
a Signed-off-by line, so that I can apply it to my -next tree?
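
For anyone following the thread, here is a minimal user-space sketch of the
two things the patch changes: testing tzp before dereferencing tzp->tbp, and
leaving the loop with "break" rather than "return" so that the unlock after
the loop still runs. This is not the thermal_sys.c code. The struct layouts,
the toy lock and the count_bindable() helper are all hypothetical stand-ins
invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; illustrative only. */
struct bind_params { int dummy; };

struct zone_params {
	struct bind_params *tbp;	/* may be NULL */
	int num_tbps;
};

struct zone {
	struct zone_params *tzp;	/* may be NULL: the case that oopsed */
};

/* Toy lock that only records whether it is currently held. */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/*
 * Walk the zones under the lock and count those that carry bind
 * parameters. The walk stops at the first zone without them, mirroring
 * the "break" in the patch.
 */
static int count_bindable(struct toy_lock *lock, struct zone *zones, size_t n)
{
	int count = 0;
	size_t i;

	toy_lock_acquire(lock);
	for (i = 0; i < n; i++) {
		struct zone_params *tzp = zones[i].tzp;

		/*
		 * Test tzp first: evaluating tzp->tbp with tzp == NULL is
		 * the NULL dereference. Use "break", not "return", so that
		 * control still reaches the release below.
		 */
		if (!tzp || !tzp->tbp)
			break;
		count++;
	}
	toy_lock_release(lock);

	return count;
}
```

Calling count_bindable() on an array whose second entry has tzp == NULL
returns 1 and leaves the toy lock released. With a "return" in place of the
"break", the lock would still be held on that path.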

thanks,
rui
