Re: boot regression on ppc64 with linux 6.2

From: Andrea Righi
Date: Wed Mar 15 2023 - 02:10:32 EST


On Wed, Mar 15, 2023 at 02:30:53PM +1100, Michael Ellerman wrote:
> Andrea Righi <andrea.righi@xxxxxxxxxxxxx> writes:
> > I'm triggering the following bug when booting my qemu powerpc VM:
>
> I'm not seeing that here :/
>
> Can you give a bit more detail?
> - qemu version
> - qemu command line
> - what userspace are you using?
> - full dmesg of the failing case

Yeah, ignore this for now. It could be related to another custom patch
that I had applied (and forgot about, sorry), this one:
https://lore.kernel.org/lkml/20230119155709.20d87e35.gary@xxxxxxxxxxx/T/

That patch is causing other issues on ppc64, so I think this failure might
be related to it. I'll do more tests, making sure I use a vanilla kernel.

Sorry for the noise.

-Andrea

>
> > event-sources: Unable to request interrupt 23 for /event-sources/hot-plug-events
> > WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/pseries/event_sources.c:26 request_event_sources_irqs+0xbc/0xf0
> > Modules linked in:
> > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.2-kc #1
> > Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1200 0xf000005 of:SLOF,HEAD pSeries
> > NIP: c000000002022eec LR: c000000002022ee8 CTR: 0000000000000000
> > REGS: c000000003483910 TRAP: 0700 Tainted: G W (6.2.2-kc)
> > MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24483200 XER: 00000000
> > CFAR: c000000000180838 IRQMASK: 0
> > GPR00: c000000002022ee8 c000000003483bb0 c000000001a5ce00 0000000000000050
> > GPR04: c000000002437d78 c000000002437e28 0000000000000001 0000000000000001
> > GPR08: c000000002437d00 0000000000000001 0000000000000000 0000000044483200
> > GPR12: 0000000000000000 c000000002720000 c000000000012758 0000000000000000
> > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR24: 0000000000000000 c0000000020033fc cccccccccccccccd c0000000000e07f0
> > GPR28: c000000000db0520 0000000000000000 c0000000fff92ac0 0000000000000017
> > NIP [c000000002022eec] request_event_sources_irqs+0xbc/0xf0
> > LR [c000000002022ee8] request_event_sources_irqs+0xb8/0xf0
> > Call Trace:
> > [c000000003483bb0] [c000000002022ee8] request_event_sources_irqs+0xb8/0xf0 (unreliable)
> > [c000000003483c40] [c000000002022fa0] __machine_initcall_pseries_init_ras_hotplug_IRQ+0x80/0xb0
> > [c000000003483c70] [c0000000000121b8] do_one_initcall+0x98/0x300
> > [c000000003483d50] [c000000002004b28] kernel_init_freeable+0x2ec/0x370
> > [c000000003483df0] [c000000000012780] kernel_init+0x30/0x190
> > [c000000003483e50] [c00000000000cf5c] ret_from_kernel_thread+0x5c/0x64
> > --- interrupt: 0 at 0x0
> >
> > I did a bisect and it seems that the offending commit is:
> > baa49d81a94b ("powerpc/pseries: hvcall stack frame overhead")
> >
> > Reverting that and also dfecd06bc552 ("powerpc: remove
> > STACK_FRAME_OVERHEAD"), because we need to re-introduce
> > STACK_FRAME_OVERHEAD, seems to fix everything.
>
> That function doesn't make a hcall, so presumably there was some earlier
> problem which we only detect here.
>
> cheers