5.19 printk breaks message ordering

From: Jason A. Donenfeld
Date: Fri Jun 17 2022 - 09:23:22 EST


Hi John & folks,

In 5.19, I'm seeing some changes in message ordering / interleaving
which lead to confusion. The most obvious (and benign) example appears
on system boot, in which the "Run /init as init process" message gets
intermixed with the messages that init actually writes() to stdout. For
example, here's a snippet from build.wireguard.com:

[ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
[ 0.469738] Write protecting the kernel read-only data: 10240k
[ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
[ 0.475236] Run /init as init process


WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64


[+] Mounting filesystems...
[+] Module self-tests:
* allowedips self-tests: pass
* nonce counter self-tests: pass
* ratelimiter self-tests: pass
[+] Enabling logging...
[+] Launching tests...
[ 0.475237] with arguments:
[ 0.475238] /init
[ 0.475238] with environment:
[ 0.475239] HOME=/
[ 0.475240] TERM=linux
[+] ip netns add wg-test-46-0
[+] ip netns add wg-test-46-1

Before the "with arguments:" and such would print prior to the
"wireguard test suite on linux 5.19" banner. Now it shows after.

I see the same thing with "Freeing unused kernel image (text/rodata gap)
memory" printing interwoven into the console of my initramfs on my
laptop. And so forth.

But the bigger issue for me is that it makes it very confusing to
interpret CI results later on. Prior, I would nice a nice correlation
of:

[+] some userspace command
[ 1.2345 ] some kernel log output
[+] some userspace command
[ 1.2346 ] some kernel log output
[+] some userspace command
[ 1.2347 ] some kernel log output

Now, the kernel log outputs are all over the place and out of order with
the sequence of commands. This makes debugging issues somewhat tricky,
because post hoc ergo propter hoc winds up being a good intuition to
follow when tracking down bugs, and now the post hoc part is muddled.

I assume this is mostly caused by your threaded printk patchset, which
moves some of that printing into a worker, which means it's more
dependent on the scheduler than before. This probably has important
benefits. But it certainly makes CI and related debugging a bit
trickier as a result.

So I was wondering if there was some way to boot the kernel with a
command line option or compile-time flag that always flushes printk
messages when they're made, or does something to make the ordering a bit
more faithful.

Thanks,
Jason