Re: smpboot: CPU numbers printed as warning

From: Petr Mladek
Date: Tue Feb 16 2021 - 04:52:43 EST


On Mon 2021-02-15 20:22:34, Paul Menzel wrote:
> Dear Linux folks,
>
>
> Using Linux 5.10.13 (and before), looking at the Linux kernel warnings, the
> CPU numbers show up. For example with 12 cpus/threads:
>
> ```
> $ sudo dmesg --level=warn
> [ 0.216103] #2
> [ 0.220105] #3
> [ 0.224103] #4
> [ 0.228104] #5
> [ 0.232110] #6
> [ 0.236101] #7
> [ 0.240102] #8
> [ 0.244102] #9
> [ 0.248100] #10
> [ 0.252098] #11
> ```

Is this the exact output from sudo dmesg --level=warn?
It is strange that each CPU number is printed on itw own line.

Anyway, it might be affected by the new lockless ringbuffer.
The original code decided whether to connect the lines only by
"current" task pointer. The lockless ring buffer takes into account
also CPU number.

Well, it has never been reliable. For example, I see here:

<6>[ 0.238262][ T1] smp: Bringing up secondary CPUs ...
<6>[ 0.239340][ T1] x86: Booting SMP configuration:
<6>[ 0.239794][ T1] .... node #0, CPUs: #1
<6>[ 0.113946][ T0] kvm-clock: cpu 1, msr 6ba01041, secondary cpu clock
<6>[ 0.113946][ T0] smpboot: CPU 1 Converting physical 0 to logical die 1
<6>[ 0.246056][ T16] kvm-guest: stealtime: cpu 1, msr 17f9f2080
<4>[ 0.246679][ T1] #2
<6>[ 0.113946][ T0] kvm-clock: cpu 2, msr 6ba01081, secondary cpu clock
<6>[ 0.113946][ T0] smpboot: CPU 2 Converting physical 0 to logical die 2
<6>[ 0.250023][ T21] kvm-guest: stealtime: cpu 2, msr 17fbf2080
<4>[ 0.250648][ T1] #3
<6>[ 0.113946][ T0] kvm-clock: cpu 3, msr 6ba010c1, secondary cpu clock
<6>[ 0.113946][ T0] smpboot: CPU 3 Converting physical 0 to logical die 3
<6>[ 0.254026][ T26] kvm-guest: stealtime: cpu 3, msr 17fdf2080
<6>[ 0.254568][ T1] smp: Brought up 1 node, 4 CPUs
<6>[ 0.254597][ T1] smpboot: Max logical packages: 4
<6>[ 0.255097][ T1] smpboot: Total of 4 processors activated (16896.11 BogoMIPS)

There are another messages printed in between that obviously break pr_cont().

> If I am not mistaken, this is from `announce_cpu()` in
> `arch/x86/kernel/smpboot.c`, and the `pr_cont()` in their “forgets” the log
> level it belongs to, maybe because of SMP and other messages are printed in
> between.
>
> ```
> if (system_state < SYSTEM_RUNNING) {
> if (node != current_node) {
> if (current_node > (-1))
> pr_cont("\n");
> current_node = node;
>
> printk(KERN_INFO ".... node %*s#%d, CPUs: ",
> node_width - num_digits(node), " ", node);
> }
>
> /* Add padding for the BSP */
> if (cpu == 1)
> pr_cont("%*s", width + 1, " ");
>
> pr_cont("%*s#%d", width - num_digits(cpu), " ", cpu);
>
> } else
> pr_info("Booting Node %d Processor %d APIC 0x%x\n",
> node, cpu, apicid);
> ```

> Would using `pr_info()` instead be an acceptable fix?

Makes sense to me.

Also you should add '\n' into the previous string to make the behavior
clear. It will always be printed on a new line when pr_info()
is used.

Best Regards,
Petr