Re: 4.18: early boot crash in thermal_cooling_device_destroy_sysfs

From: Zhang Rui
Date: Tue Oct 30 2018 - 21:07:15 EST


Hi, Randy,

On Fri, 2018-10-26 at 20:35 -0700, Randy Dunlap wrote:
> On 10/26/18 2:14 AM, Rafael J. Wysocki wrote:
> >
> > On Monday, October 22, 2018 8:37:25 PM CEST Randy Dunlap wrote:
> > >
> > >
> > > On 8/16/18 2:33 PM, Randy Dunlap wrote:
> > > >
> > > > Hi,
> > > >
> > > > Sorry for the photo.  That's all I have available so far.
> > > >
> > > > https://www.infradead.org/~rdunlap/doc/IMG_20180816_133254743_HDR.jpg
> > > >
> > > >
> > > > Does anyone recognize this?
> > > >
> > > > This is an (older) Toshiba laptop.  The kernel .config is mostly an
> > > > allmodconfig with some DEBUG options disabled and other options enabled
> > > > so that it can boot without using an initramfs.  (and with COMPILE_TEST
> > > > disabled :)
> > > >
> > > >
> > > > The full kernel .config file is attached.
> > > >
> > > > Thanks,
> > > >
> > > This is a result of CONFIG_DEBUG_TEST_DRIVER_REMOVE=y.
> > > [switch from 64-bit to 32-bit machine]
> > >
> > >
> > > When using CONFIG_DEBUG_VM=y, it BUGs at:
> > > [    5.553603] ------------[ cut here ]------------
> > > [    5.553733] kernel BUG at arch/x86/mm/physaddr.c:75!
> > > [    5.557788] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > > [    5.558738] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7 #4
> > > [    5.558738] Hardware name: Dell Inc. Inspiron 1318                   /0C236D, BIOS A04 01/15/2009
> > > [    5.558738] EIP: __phys_addr+0x40/0x90
> > > [    5.558738] Code: 00 40 75 2e 8b 15 00 57 23 d5 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 5b e8 2e ca ff ff 39 d8 75 4a 89 d8 5b 5d c3 8d 74 26 00 90 <0f> 0b 8d b6 00 00 00 00 8b 0d 80 56 23 d5 8d 91 00 00 80 00 39 d0
> > > [    5.558738] EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 00140011 EDX: 00000000
> > > [    5.558738] ESI: f4890000 EDI: d4a58d60 EBP: f40c1e0c ESP: f40c1e08
> > > [    5.558738] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210a97
> > > [    5.558738] CR0: 80050033 CR2: 00000000 CR3: 14cad000 CR4: 000406d0
> > > [    5.558738] Call Trace:
> > > [    5.558738]  kfree+0x1f/0x160
> > > [    5.558738]  thermal_cooling_device_destroy_sysfs+0x11/0x20
> > > [    5.558738]  thermal_cooling_device_unregister+0x168/0x180
> > > [    5.558738]  acpi_pss_perf_exit.isra.4+0x32/0x50
> > > [    5.558738]  acpi_processor_stop+0x4d/0x60
> > > [    5.558738]  really_probe+0xa3/0x3e0
> > > [    5.558738]  driver_probe_device+0x5b/0x120
> > > [    5.558738]  __driver_attach+0xd9/0x100
> > > [    5.558738]  ? driver_probe_device+0x120/0x120
> > > [    5.558738]  bus_for_each_dev+0x56/0x90
> > > [    5.558738]  driver_attach+0x14/0x20
> > > [    5.558738]  ? driver_probe_device+0x120/0x120
> > > [    5.558738]  bus_add_driver+0x117/0x210
> > > [    5.558738]  driver_register+0x61/0xb0
> > > [    5.558738]  acpi_processor_driver_init+0x19/0x88
> > > [    5.558738]  ? acpi_pci_slot_init+0xf/0xf
> > > [    5.558738]  do_one_initcall+0x3e/0x15a
> > > [    5.558738]  ? do_early_param+0x75/0x75
> > > [    5.558738]  kernel_init_freeable+0x170/0x1f3
> > > [    5.558738]  ? rest_init+0xcd/0xcd
> > > [    5.558738]  kernel_init+0x8/0xdb
> > > [    5.558738]  ret_from_fork+0x2e/0x38
> > > [    5.558738] Modules linked in:
> > > [    5.625269] _warn_unseeded_randomness: 1 callbacks suppressed
> > > [    5.625272] random: get_random_bytes called from init_oops_id+0x3a/0x40 with crng_init=0
> > > [    5.629758] ---[ end trace 65b17bf4d18e7692 ]---
> > > [    5.631573] EIP: __phys_addr+0x40/0x90
> > > [    5.633242] Code: 00 40 75 2e 8b 15 00 57 23 d5 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 5b e8 2e ca ff ff 39 d8 75 4a 89 d8 5b 5d c3 8d 74 26 00 90 <0f> 0b 8d b6 00 00 00 00 8b 0d 80 56 23 d5 8d 91 00 00 80 00 39 d0
> > > [    5.638618] EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 00140011 EDX: 00000000
> > > [    5.640703] ESI: f4890000 EDI: d4a58d60 EBP: f40c1e0c ESP: d4cb13dc
> > > [    5.642801] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210a97
> > > [    5.645053] CR0: 80050033 CR2: 00000000 CR3: 14cad000 CR4: 000406d0
> > > [    5.647179] Kernel panic - not syncing: Fatal exception
> > > [    5.648172] Kernel Offset: 0x13000000 from 0xc1000000 (relocation range: 0xc0000000-0xf77fdfff)
> > > [    5.648172] ---[ end Kernel panic - not syncing: Fatal exception ]---
> > >
> > >
> > > When not using CONFIG_DEBUG_VM, it BUGs in kfree:
> > > [    5.497864] ------------[ cut here ]------------
> > > [    5.498215] kernel BUG at mm/slub.c:3901!
> > > [    5.501739] invalid opcode: 0000 [#1] PREEMPT SMP
> > > [    5.502720] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7 #3
> > > [    5.502720] Hardware name: Dell Inc. Inspiron 1318                   /0C236D, BIOS A04 01/15/2009
> > > [    5.502720] EIP: kfree+0x117/0x150
> > > [    5.502720] Code: 74 21 8b 06 31 d2 f6 c4 80 74 04 0f b6 56 31 89 f0 e8 7d e0 fa ff e9 7b ff ff ff 8d b4 26 00 00 00 00 90 8b 46 04 a8 01 75 d8 <0f> 0b 8d b4 26 00 00 00 00 8b 75 f0 ff 75 ec 89 d9 89 f8 6a 01 53
> > > [    5.502720] EAX: 00000100 EBX: 6b6b6b6b ECX: 00140011 EDX: 00000000
> > > [    5.502720] ESI: f67dac70 EDI: ccc4aca0 EBP: f4083e28 ESP: f4083e10
> > > [    5.502720] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
> > > [    5.502720] CR0: 80050033 CR2: ffd14000 CR3: 0ce94000 CR4: 000406d0
> > > [    5.502720] Call Trace:
> > > [    5.502720]  thermal_cooling_device_destroy_sysfs+0x11/0x20
> > > [    5.502720]  thermal_cooling_device_unregister+0x168/0x180
> > > [    5.502720]  acpi_pss_perf_exit.isra.4+0x32/0x50
> > > [    5.502720]  acpi_processor_stop+0x4d/0x60
> > > [    5.502720]  really_probe+0xa3/0x3e0
> > > [    5.502720]  driver_probe_device+0x5b/0x120
> > > [    5.502720]  __driver_attach+0xd9/0x100
> > > [    5.502720]  ? driver_probe_device+0x120/0x120
> > > [    5.502720]  bus_for_each_dev+0x56/0x90
> > > [    5.502720]  driver_attach+0x14/0x20
> > > [    5.502720]  ? driver_probe_device+0x120/0x120
> > > [    5.502720]  bus_add_driver+0x117/0x210
> > > [    5.502720]  driver_register+0x61/0xb0
> > > [    5.502720]  acpi_processor_driver_init+0x19/0x88
> > > [    5.502720]  ? acpi_pci_slot_init+0xf/0xf
> > > [    5.502720]  do_one_initcall+0x3e/0x15a
> > > [    5.502720]  ? do_early_param+0x75/0x75
> > > [    5.502720]  kernel_init_freeable+0x170/0x1f3
> > > [    5.502720]  ? rest_init+0xcd/0xcd
> > > [    5.502720]  kernel_init+0x8/0xdb
> > > [    5.502720]  ret_from_fork+0x2e/0x38
> > > [    5.502720] Modules linked in:
> > > [    5.567678] _warn_unseeded_randomness: 1 callbacks suppressed
> > > [    5.567682] random: get_random_bytes called from init_oops_id+0x3a/0x40 with crng_init=0
> > > [    5.572237] ---[ end trace 1b6e88c03e412db2 ]---
> > > [    5.574099] EIP: kfree+0x117/0x150
> > > [    5.575673] Code: 74 21 8b 06 31 d2 f6 c4 80 74 04 0f b6 56 31 89 f0 e8 7d e0 fa ff e9 7b ff ff ff 8d b4 26 00 00 00 00 90 8b 46 04 a8 01 75 d8 <0f> 0b 8d b4 26 00 00 00 00 8b 75 f0 ff 75 ec 89 d9 89 f8 6a 01 53
> > > [    5.581124] EAX: 00000100 EBX: 6b6b6b6b ECX: 00140011 EDX: 00000000
> > > [    5.583243] ESI: f67dac70 EDI: ccc4aca0 EBP: f4083e28 ESP: cce983dc
> > > [    5.585347] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210246
> > > [    5.587600] CR0: 80050033 CR2: ffd14000 CR3: 0ce94000 CR4: 000406d0
> > > [    5.589747] Kernel panic - not syncing: Fatal exception
> > > [    5.590740] Kernel Offset: 0xb200000 from 0xc1000000 (relocation range: 0xc0000000-0xf77fdfff)
> > > [    5.590740] ---[ end Kernel panic - not syncing: Fatal exception ]---
> > >
> > >
> > >
> > >
> > This admittedly is a long shot, but does the appended patch help?
> Thanks for the patch, but:
> Nope, same crash.
>

Thanks for the report. Please confirm whether the following patch fixes
the problem:
https://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux.git/commit/?h=next&id=3c587768271e9c20276522025729e4ebca51583b

thanks,
rui