Re: Change of behaviour when console=null and ttynull driver is used

From: Guillermo Rodriguez Garcia
Date: Fri Mar 24 2023 - 13:02:12 EST


Hi Petr,

On Fri, 24 Mar 2023 at 16:08, Petr Mladek (<pmladek@xxxxxxxx>) wrote:
>
> Hi Linus,
>
> I need advice on whether to revert commit 3cffa06aeef7ece30f
> ("printk/console: Allow to disable console output by using
> console="" or console=null").
>
> On Wed 2023-03-22 09:33:32, Guillermo Rodriguez Garcia wrote:
> > Hi Petr,
> >
> > On Fri, 17 Mar 2023 at 13:51, Petr Mladek (<pmladek@xxxxxxxx>) wrote:
> > >
> > > On Thu 2023-03-16 11:29:26, Guillermo Rodriguez Garcia wrote:
> > > > Hi all,
> > > >
> > > > We have several embedded systems where we pass console= or console=null
> > > > in production to disable the console.
> > > >
> > > > Later we check for this in user space: in our inittab we check if fd0
> > > > is "associated with a terminal" (test -t 0); if so, we are in
> > > > development mode and we open a debug shell; otherwise (console
> > > > disabled) we just start the application.
> > > >
> > > > Recently [1] this behaviour has changed and now if we pass console= or
> > > > console=null, the new ttynull driver is used. This breaks the check we
> > > > were doing (test -t 0 always true now).
> > > >
> > > > [1]: https://lore.kernel.org/lkml/X%2FcDG%2FxCCzSWW2cd@alley/t/
> > >
> > > This is actually exactly the problem that the change tried to solve.
> > > Some systems failed to boot when there was no console and they tried
> > > to write something at stdout.
> >
> > Well, I did not have any problem before this change.
> >
> > I understand that some systems had a problem (many others didn't, and
> > setting console= or console=null has been standard practice for a long
> > time). Since this change in behaviour could (and did) break things in
> > user space, perhaps it should have been made opt-in...
>
> Historically, a single invalid console=<name> kernel parameter
> could result in no console being registered. As a result, there
> is no stdin/stdout/stderr for the init process.
>
> From the code perspective, the <name> is considered a preferred
> console. No console is registered by default because there is
> the preferred console. But an invalid <name> never matches.
>
> The non-existing stdin/stdout/stderr caused Oopses reported during boot, see
> https://lore.kernel.org/all/20200309052915.858-1-shreyas.joshi@xxxxxxxxx/
> https://lore.kernel.org/all/20201006065907.GA528@jagdpanzerIV.localdomain/
>
> It would have been great to debug the root of the Oopses.
> But it was not trivial. And it looked like the fallback
> to ttynull made sense even if the kernel did not crash:
>
> 1. The init process might expect a working stdin/stderr/stdout.
> The POSIX standard says:
>
> At program start-up, three streams are predefined and need not be
> opened explicitly: standard input (for reading conventional input),
> standard output (for writing conventional output), and standard error
> (for writing diagnostic output). When opened, the standard error
> stream is not fully buffered; the standard input and standard output
> streams are fully buffered if and only if the stream can be determined
> not to refer to an interactive device.
>
> 2. Also the comment in init/main.c is rather clear:
>
> /* Open /dev/console, for stdin/stdout/stderr, this should never fail */
> void __init console_on_rootfs(void)
>
>
> This is why we implemented the fallback to ttynull. But it causes
> regressions on Chromebooks. They use:
>
> + console="" to disable consoles on production systems.
> (better performance, nobody reads the console anyway).
>
> + "test -f 0" to detect whether the Chromebook is in
> debugging mode.

FWIW, in my case (not a Chromebook) I was not using test -f, but test
-t in order to check whether fd 0 was "a terminal".

(In hindsight perhaps this worked because when the console is invalid,
there is no fd 0 at all).
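
To illustrate, here is a minimal sketch of the kind of check I mean. The
function name and the two mode strings are made up for this example and
are not what our inittab actually contains. It only shows that test -t 0
is false both when fd 0 is closed and when fd 0 is open but is not a
terminal; with ttynull, fd 0 is a tty, so the same check picks the debug
branch.

```shell
#!/bin/sh
# Hypothetical helper: report which branch an inittab-style check would take.
pick_mode() {
    if test -t 0; then
        # fd 0 is a terminal: treated as development mode
        echo "debug-shell"
    else
        # fd 0 is closed or is not a terminal: treated as production
        echo "application"
    fi
}

# fd 0 closed, which is what I suspect happened with an invalid console
pick_mode 0<&-        # prints "application"

# fd 0 open but not a terminal
pick_mode </dev/null  # prints "application"
```

If this is right, a check that only asks "is fd 0 a terminal?" cannot
tell a real console from ttynull.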

[...]
>
> The regression did not reach normal users. Chromebook developers

This is not entirely correct; the regression did not affect only
Chromebooks. I suspect it affected many embedded systems where setting
console="" or console=null is very common. For example, in U-Boot,
when you use the "silent" parameter to suppress log output (which is
common practice in production), the console parameter is automatically
set to "" (empty string). When the meaning of console="" changed, this
also caused side-effects that had to be fixed in U-Boot; see
https://github.com/u-boot/u-boot/commit/ba9aa40bb387385b8ef8b6594661a97ddcb8d04c

My point is that perhaps only Chromebook developers complained here,
but many other users were probably affected (especially on embedded
systems).

BR,

Guillermo
--
Guillermo Rodriguez Garcia
guille.rodriguez@xxxxxxxxx