Re: [PATCH 2/2] fbcon: Defer console takeover for splash screens to first switch

From: Daniel van Vugt
Date: Mon Feb 26 2024 - 20:15:19 EST


On 27/2/24 02:23, Hans de Goede wrote:
> Hi All,
>
> On 2/2/24 09:53, Daniel van Vugt wrote:
>> Until now, deferred console takeover only meant defer until there is
>> output. But that risks stepping on the toes of userspace splash screens,
>> as console messages may appear before the splash screen. So check for the
>> "splash" parameter (as used by Plymouth) and if present then extend the
>> deferral until the first switch.
>
> Daniel, thank you for your patch but I do not believe that this
> is the right solution. Deferring fbcon takeover further then
> after the first text is output means that any errors about e.g.
> a corrupt initrd or the kernel erroring out / crashing will not
> be visible.

That's not really correct. If a boot failure has occurred after the splash then
pressing escape shows the log. If a boot failure has occurred before the splash
then it can be debugged visually by rebooting without the "splash" parameter.

>
> When the kernel e.g. oopses or panics because of not finding
> its rootfs (I tested the latter option when writing the original
> deferred fbcon takeover code) then fbcon must takeover and
> print the messages from the dying kernel so that the user has
> some notion of what is going wrong.

Indeed, just reboot without the "splash" parameter to do that.

>
> And since your patch seems to delay switching till the first
> vc-switch this means that e.g. even after say gdm refusing
> to start because of issues there still will be no text
> output. This makes debugging various issues much harder.

I've debugged many gdm failures and it is never useful to use the console for
those. Reboot and get the system journal instead.

>
> Moreover Fedora has been doing flickerfree boot for many
> years without needing this.

I believe Fedora has a mostly working solution, but not totally reliable, as
mentioned in the commit message:

"even systems whose splash exists in initrd may not be not immune because they
still rely on racing against all possible kernel messages that might
trigger the fbcon takeover"

>
> The kernel itself will be quiet as long as you set
> CONFIG_CONSOLE_LOGLEVEL_QUIET=3 Ubuntu atm has set this
> to 4 which means any kernel pr_err() or dev_err()
> messages will get through and since there are quite
> a few false positives of those Ubuntu really needs
> to set CONFIG_CONSOLE_LOGLEVEL_QUIET=3 to fix part of:
> https://bugs.launchpad.net/bugs/1970069

Incorrect. In my testing some laptops needed log level as low as 2 to go quiet.
And the Ubuntu kernel team is never going to fix all those for non-sponsored
devices.

>
> After that it is "just" a matter of not making userspace
> output anything unless it has errors to report.
>
> systemd already is quiet by default (only logging
> errors) when quiet is on the kernel commandline.

Unfortunately not true for Ubuntu. We carry a noisy systemd patch which I'm
told we can't remove in the short term:

https://bugs.launchpad.net/ubuntu/+source/plymouth/+bug/1970069/comments/39

>
> So any remaining issues are Ubuntu specific boot
> process bits and Ubuntu really should be able to
> make those by silent unless they have important
> info (errors or other unexpected things) to report.
>
> Given that this will make debugging boot issues
> much harder and that there are other IMHO better
> alternatives I'm nacking this patch: NACK.
>
> FWIW I believe that I'm actually saving Ubuntu
> from shooting themselves in the foot here,
> hiding all sort of boot errors (like the initrd
> not finding /) until the user does a magic
> alt+f2 followed by alt+f1 incantation really is
> not doing yourself any favors wrt debugging any
> sort of boot failures.
>
> Regards,
>
> Hans

Thanks for your input, but I respectfully disagree and did consider these
points already.

- Daniel

>
>
>
>
>
>> Closes: https://bugs.launchpad.net/bugs/1970069
>> Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
>> Signed-off-by: Daniel van Vugt <daniel.van.vugt@xxxxxxxxxxxxx>
>> ---
>> drivers/video/fbdev/core/fbcon.c | 32 +++++++++++++++++++++++++++++---
>> 1 file changed, 29 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
>> index 63af6ab034..5b9f7635f7 100644
>> --- a/drivers/video/fbdev/core/fbcon.c
>> +++ b/drivers/video/fbdev/core/fbcon.c
>> @@ -76,6 +76,7 @@
>> #include <linux/crc32.h> /* For counting font checksums */
>> #include <linux/uaccess.h>
>> #include <asm/irq.h>
>> +#include <asm/cmdline.h>
>>
>> #include "fbcon.h"
>> #include "fb_internal.h"
>> @@ -146,6 +147,7 @@ static inline void fbcon_map_override(void)
>>
>> #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
>> static bool deferred_takeover = true;
>> +static int initial_console = -1;
>> #else
>> #define deferred_takeover false
>> #endif
>> @@ -3341,7 +3343,7 @@ static void fbcon_register_existing_fbs(struct work_struct *work)
>> console_unlock();
>> }
>>
>> -static struct notifier_block fbcon_output_nb;
>> +static struct notifier_block fbcon_output_nb, fbcon_switch_nb;
>> static DECLARE_WORK(fbcon_deferred_takeover_work, fbcon_register_existing_fbs);
>>
>> static int fbcon_output_notifier(struct notifier_block *nb,
>> @@ -3358,6 +3360,21 @@ static int fbcon_output_notifier(struct notifier_block *nb,
>>
>> return NOTIFY_OK;
>> }
>> +
>> +static int fbcon_switch_notifier(struct notifier_block *nb,
>> + unsigned long action, void *data)
>> +{
>> + struct vc_data *vc = data;
>> +
>> + WARN_CONSOLE_UNLOCKED();
>> +
>> + if (vc->vc_num != initial_console) {
>> + dummycon_unregister_switch_notifier(&fbcon_switch_nb);
>> + dummycon_register_output_notifier(&fbcon_output_nb);
>> + }
>> +
>> + return NOTIFY_OK;
>> +}
>> #endif
>>
>> static void fbcon_start(void)
>> @@ -3370,7 +3387,14 @@ static void fbcon_start(void)
>>
>> if (deferred_takeover) {
>> fbcon_output_nb.notifier_call = fbcon_output_notifier;
>> - dummycon_register_output_notifier(&fbcon_output_nb);
>> + fbcon_switch_nb.notifier_call = fbcon_switch_notifier;
>> + initial_console = fg_console;
>> +
>> + if (cmdline_find_option_bool(boot_command_line, "splash"))
>> + dummycon_register_switch_notifier(&fbcon_switch_nb);
>> + else
>> + dummycon_register_output_notifier(&fbcon_output_nb);
>> +
>> return;
>> }
>> #endif
>> @@ -3417,8 +3441,10 @@ void __exit fb_console_exit(void)
>> {
>> #ifdef CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER
>> console_lock();
>> - if (deferred_takeover)
>> + if (deferred_takeover) {
>> dummycon_unregister_output_notifier(&fbcon_output_nb);
>> + dummycon_unregister_switch_notifier(&fbcon_switch_nb);
>> + }
>> console_unlock();
>>
>> cancel_work_sync(&fbcon_deferred_takeover_work);
>