Re: linux-next: boot failure in today's linux-next

From: Stephen Rothwell
Date: Mon Apr 26 2021 - 03:55:57 EST


Hi all,

On Mon, 26 Apr 2021 16:36:06 +1000 Stephen Rothwell <sfr@xxxxxxxxxxxxxxxx> wrote:
>
> Today's linux-next build (powerpc_pseries_le_defconfig)
> failed its qemu boot tests like this:
>
> [ 1.833361][ T1] ibmvscsi 71000003: SRP_VERSION: 16.a
> [ 1.834439][ T1] ibmvscsi 71000003: Maximum ID: 64 Maximum LUN: 32 Maximum Channel: 3
> [ 1.834683][ T1] scsi host0: IBM POWER Virtual SCSI Adapter 1.5.9
> [ 1.842605][ C0] ibmvscsi 71000003: partner initialization complete
> [ 1.844979][ C0] ibmvscsi 71000003: host srp version: 16.a, host partition qemu (0), OS 2, max io 2097152
> [ 1.845502][ C0] ibmvscsi 71000003: sent SRP login
> [ 1.845853][ C0] ibmvscsi 71000003: SRP_LOGIN succeeded
> [ 1.851447][ T1] BUG: Kernel NULL pointer dereference on write at 0x00000390
> [ 1.851577][ T1] Faulting instruction address: 0xc00000000070386c
> [ 1.852171][ T1] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 1.852324][ T1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [ 1.852689][ T1] Modules linked in:
> [ 1.853136][ T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.12.0 #2
> [ 1.853555][ T1] NIP: c00000000070386c LR: c000000000703a6c CTR: 0000000000000000
> [ 1.853679][ T1] REGS: c0000000063a2f40 TRAP: 0380 Not tainted (5.12.0)
> [ 1.853870][ T1] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44002240 XER: 00000000
> [ 1.854305][ T1] CFAR: c000000000703a68 IRQMASK: 0
> [ 1.854305][ T1] GPR00: c000000000703a6c c0000000063a31e0 c00000000146b200 c0000000080ca800
> [ 1.854305][ T1] GPR04: c000000006067380 c00c000000020180 0000000000000024 0000000000008500
> [ 1.854305][ T1] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [ 1.854305][ T1] GPR12: 0000000000002000 c000000001640000 c000000008068508 0000000000000020
> [ 1.854305][ T1] GPR16: 0000000000000000 0000000000000024 c000000000f85f78 c000000000f0d998
> [ 1.854305][ T1] GPR20: c0000000013b59e0 0000000000000003 c0000000063a340c 0000000000000001
> [ 1.854305][ T1] GPR24: 0000000000000000 c0000000084a3000 c0000000080ca800 c00c000000020180
> [ 1.854305][ T1] GPR28: 0000000000008500 c0000000080ca800 0000000000000024 c000000006067380
> [ 1.855486][ T1] NIP [c00000000070386c] bio_add_hw_page+0x7c/0x240
> [ 1.856357][ T1] LR [c000000000703a6c] bio_add_pc_page+0x3c/0x70
> [ 1.856723][ T1] Call Trace:
> [ 1.856890][ T1] [c0000000063a31e0] [0000000000000c00] 0xc00 (unreliable)
> [ 1.857390][ T1] [c0000000063a3230] [c00000000070105c] bio_kmalloc+0x3c/0xd0
> [ 1.857514][ T1] [c0000000063a3260] [c000000000713014] blk_rq_map_kern+0x164/0x4a0
> [ 1.857630][ T1] [c0000000063a32d0] [c0000000008e17dc] __scsi_execute+0x1cc/0x270
> [ 1.857746][ T1] [c0000000063a3350] [c0000000008e7bf0] scsi_probe_and_add_lun+0x250/0xd90
> [ 1.857887][ T1] [c0000000063a34c0] [c0000000008e921c] __scsi_scan_target+0x17c/0x630
> [ 1.858007][ T1] [c0000000063a35d0] [c0000000008e9900] scsi_scan_channel+0x90/0xe0
> [ 1.858133][ T1] [c0000000063a3620] [c0000000008e9ba8] scsi_scan_host_selected+0x138/0x1a0
> [ 1.858258][ T1] [c0000000063a3670] [c0000000008e9fec] scsi_scan_host+0x2dc/0x320
> [ 1.858367][ T1] [c0000000063a3710] [c00000000091b2a0] ibmvscsi_probe+0xa70/0xa80
> [ 1.858487][ T1] [c0000000063a3800] [c0000000000eb8ac] vio_bus_probe+0x9c/0x460
> [ 1.858616][ T1] [c0000000063a38a0] [c0000000008979bc] really_probe+0x12c/0x6b0
> [ 1.858749][ T1] [c0000000063a3950] [c000000000897fd4] driver_probe_device+0x94/0x130
> [ 1.858874][ T1] [c0000000063a3980] [c00000000089896c] device_driver_attach+0x11c/0x130
> [ 1.858999][ T1] [c0000000063a39c0] [c000000000898a38] __driver_attach+0xb8/0x1a0
> [ 1.859123][ T1] [c0000000063a3a10] [c0000000008941a8] bus_for_each_dev+0xa8/0x130
> [ 1.859257][ T1] [c0000000063a3a70] [c000000000896ef4] driver_attach+0x34/0x50
> [ 1.859381][ T1] [c0000000063a3a90] [c000000000896510] bus_add_driver+0x170/0x2b0
> [ 1.859503][ T1] [c0000000063a3b20] [c000000000899b04] driver_register+0xb4/0x1c0
> [ 1.859626][ T1] [c0000000063a3b90] [c0000000000ea808] __vio_register_driver+0x68/0x90
> [ 1.859754][ T1] [c0000000063a3bb0] [c0000000010cee74] ibmvscsi_module_init+0xa4/0xdc
> [ 1.859931][ T1] [c0000000063a3bf0] [c000000000012190] do_one_initcall+0x60/0x2c0
> [ 1.860071][ T1] [c0000000063a3cc0] [c0000000010846e4] kernel_init_freeable+0x300/0x3a0
> [ 1.860207][ T1] [c0000000063a3da0] [c000000000012764] kernel_init+0x2c/0x168
> [ 1.860336][ T1] [c0000000063a3e10] [c00000000000d5ec] ret_from_kernel_thread+0x5c/0x70
> [ 1.860690][ T1] Instruction dump:
> [ 1.861072][ T1] fba10038 7cbb2b78 7c7d1b78 7cfc3b78 a1440048 2c2a0000 4082008c a13f004a
> [ 1.861328][ T1] 7c095040 40810110 e93f0008 811f0028 <e9290390> e9290050 812903d8 7d3e4850
> [ 1.863000][ T1] ---[ end trace c49ca2d91ee47d7f ]---
> [ 1.879456][ T1]
> [ 2.880941][ T1] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>
> I don't know what caused this, but it is some change since Friday.
>
> I have left it like this.

Bisection leads to commit

42fb54fbc707 ("bio: limit bio max size")

from the block tree. Reverting that commit on top of today's
linux-next allows the boot to work again.

--
Cheers,
Stephen Rothwell
