Re: SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

From: Christophe Leroy
Date: Thu Aug 08 2019 - 09:12:07 EST




On 08/08/2019 at 10:46, Christophe Leroy wrote:


On 07/08/2019 at 03:24, Chris Packham wrote:
On Wed, 2019-08-07 at 11:13 +1000, Michael Ellerman wrote:
Chris Packham <Chris.Packham@xxxxxxxxxxxxxxxxxxx> writes:

On Tue, 2019-08-06 at 21:32 +1000, Michael Ellerman wrote:
The difference between a working and non-working defconfig is
CONFIG_PREEMPT; specifically, CONFIG_PREEMPT=y makes my system hang at
boot.

Is that now intentionally prohibited on 64-bit powerpc?
It's not prohibited, but it probably should be because no one really
tests it properly. I have a handful of IBM machines where I boot a
PREEMPT kernel but that's about it.

The corenet configs don't have PREEMPT enabled, which suggests it was
never really supported on those machines.

But maybe someone from NXP can tell me otherwise.


I think our workloads need CONFIG_PREEMPT=y because our systems have
switch ASIC drivers implemented in userland and we need to be able to
react quickly to network events in order to prevent loops. We have seen
instances of this not happening simply because some other process is in
the middle of a syscall.

One thing I am working on here is a setup with a few vendor boards and
some of our own kit that we can test the upstream kernels on. Hopefully
that'd make these kinds of reports more timely rather than just
whenever we decide to move to a new kernel version.




The defconfig also sets CONFIG_DEBUG_PREEMPT. Have you tried without CONFIG_DEBUG_PREEMPT ?


Reproduced on QEMU. CONFIG_DEBUG_PREEMPT is the trigger. early_init_this_mmu() calls smp_processor_id(); when CONFIG_DEBUG_PREEMPT is set, that resolves to debug_smp_processor_id() instead of raw_smp_processor_id(), and this point in boot is too early for debug_smp_processor_id().

As this call is useless, just drop it.
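
For illustration only, here is a small user-space model of the mechanism described above. It is a sketch and not the kernel code or the patch: the model_* names and the "too early" flag are hypothetical, and the only thing mirrored from the kernel is that smp_processor_id() selects debug_smp_processor_id() or raw_smp_processor_id() depending on CONFIG_DEBUG_PREEMPT.

```c
#include <stdbool.h>

/* Toggle to model the config option discussed in this thread. */
#define CONFIG_DEBUG_PREEMPT 1

/* Hypothetical model state; these are not kernel symbols. */
static bool model_early_boot = true;  /* true while it is "too early" */
static int model_unsafe_calls;        /* checked variant used too early */

/* Cheap variant: only reads the CPU number (always 0 in this model). */
static unsigned int raw_smp_processor_id(void)
{
    return 0;
}

/*
 * Checked variant. Assumption for this model: it must not run while
 * model_early_boot is set. The counter stands in for the reported hang.
 */
static unsigned int debug_smp_processor_id(void)
{
    if (model_early_boot)
        model_unsafe_calls++;
    return raw_smp_processor_id();
}

/* Same selection idea as the kernel macro: the config picks the variant. */
#ifdef CONFIG_DEBUG_PREEMPT
#define smp_processor_id() debug_smp_processor_id()
#else
#define smp_processor_id() raw_smp_processor_id()
#endif

/* Hypothetical early-init routine that makes the unneeded call. */
static void model_early_init_with_call(void)
{
    (void)smp_processor_id();  /* result unused, so the call can go */
}

/* Same routine with the unneeded call dropped. */
static void model_early_init_without_call(void)
{
    /* nothing touches smp_processor_id() this early any more */
}
```

In this model, model_early_init_with_call() bumps model_unsafe_calls only when CONFIG_DEBUG_PREEMPT is defined, while model_early_init_without_call() never does, which is why dropping the unused call sidesteps the problem regardless of the config.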

Can you test the patch at https://patchwork.ozlabs.org/patch/1144005/ ?

Thanks
Christophe