Re: [PATCH RFT 6/6] serial: sh-sci: Increment the runtime usage counter for the earlycon device

From: Geert Uytterhoeven
Date: Thu Dec 19 2024 - 09:31:24 EST


Hi Claudiu,

On Wed, Dec 4, 2024 at 4:58 PM Claudiu <claudiu.beznea@xxxxxxxxx> wrote:
> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>
> In the sh-sci driver, serial ports are mapped to the sci_ports[] array,
> with earlycon mapped at index zero.
>
> The uart_add_one_port() function eventually calls __device_attach(),
> which, in turn, calls pm_request_idle(). The identified code path is as
> follows:
>
> uart_add_one_port() ->
>   serial_ctrl_register_port() ->
>     serial_core_register_port() ->
>       serial_core_port_device_add() ->
>         serial_base_port_add() ->
>           device_add() ->
>             bus_probe_device() ->
>               device_initial_probe() ->
>                 __device_attach() ->
>                   // ...
>                   if (dev->p->dead) {
>                     // ...
>                   } else if (dev->driver) {
>                     // ...
>                   } else {
>                     // ...
>                     pm_request_idle(dev);
>                     // ...
>                   }
>
> The earlycon device clocks are enabled by the bootloader. However, the
> pm_request_idle() call in __device_attach() disables the SCI port clocks
> while earlycon is still active.
>
> The earlycon write function, serial_console_write(), calls
> sci_poll_put_char() via serial_console_putchar(). If the SCI port clocks
> are disabled, writing to earlycon may sometimes cause the SR.TDFE bit to
> remain unset indefinitely, causing the while loop in sci_poll_put_char()
> to never exit. On single-core SoCs, this can result in the system being
> blocked during boot when this issue occurs.
>
> To resolve this, increment the runtime PM usage counter for the earlycon
> SCI device before registering the UART port.
>
> Fixes: 0b0cced19ab1 ("serial: sh-sci: Add CONFIG_SERIAL_EARLYCON support")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>

Thanks for your patch!
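
To make the mechanism in the description concrete, here is a minimal
userspace sketch of the usage-counter semantics it relies on. This is a
toy model, not kernel code: struct toy_dev, toy_get_noresume() and
toy_request_idle() are hypothetical stand-ins for the device,
pm_runtime_get_noresume() and pm_request_idle(). It assumes that an idle
request turns the clocks off immediately when the counter is zero; the
real runtime PM core is asynchronous and checks more conditions than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a device: only the state this example needs. */
struct toy_dev {
	int usage_count;	/* models the runtime PM usage counter */
	bool clocks_on;		/* models "clocks left enabled by the bootloader" */
};

/* Models pm_runtime_get_noresume(): bump the counter, touch no hardware. */
static void toy_get_noresume(struct toy_dev *d)
{
	d->usage_count++;
}

/*
 * Models the effect of pm_request_idle() in this simplified world:
 * the device may be powered down only if nobody holds a reference.
 */
static void toy_request_idle(struct toy_dev *d)
{
	assert(d->usage_count >= 0);
	if (d->usage_count == 0)
		d->clocks_on = false;
}
```

In this model, a device whose counter is still zero loses its clocks on
the idle request, while one that took a reference first keeps them,
which is the behaviour the patch wants for the earlycon port.
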

> --- a/drivers/tty/serial/sh-sci.c
> +++ b/drivers/tty/serial/sh-sci.c
> @@ -3435,7 +3435,24 @@ static int sci_probe_single(struct platform_device *dev,
>  		sciport->port.flags |= UPF_HARD_FLOW;
>  	}
>
> +	/*
> +	 * In case:
> +	 * - this is the earlycon port (mapped on index 0 in sci_ports[]) and
> +	 * - it now maps to an alias other than zero and
> +	 * - the earlycon is still alive (e.g., "earlycon keep_bootcon" is
> +	 *   available in bootargs)
> +	 *
> +	 * we need to avoid disabling clocks and PM domains through the runtime
> +	 * PM APIs called in __device_attach(). For this, increment the runtime
> +	 * PM reference counter (the clocks and PM domains were already enabled
> +	 * by the bootloader). Otherwise the earlycon may access the HW when it
> +	 * has no clocks enabled leading to failures (infinite loop in
> +	 * sci_poll_put_char()).
> +	 */
> +
>  	if (sci_ports[0].earlycon && sci_ports[0].port.mapbase == sci_res->start) {

Now there are two tests for mapbase: here and in sci_probe()...
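
Purely as an illustration of what a single, shared test could look like
(this is not part of the patch and not a concrete proposal for the
driver), here is a userspace sketch. struct toy_sci_port and
toy_is_earlycon_port() are hypothetical names; the struct keeps only the
two fields that the condition in the hunk above reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the check looks at. */
struct toy_sci_port {
	bool earlycon;		/* port 0 was set up as earlycon */
	uint64_t mapbase;	/* physical base address of the port */
};

/*
 * One place for the "is this the earlycon port?" test, so that callers
 * do not each repeat the mapbase comparison.
 */
static bool toy_is_earlycon_port(const struct toy_sci_port *port0,
				 uint64_t res_start)
{
	assert(port0 != NULL);
	return port0->earlycon && port0->mapbase == res_start;
}
```

Both call sites would then ask the same question through one function,
so the two comparisons cannot drift apart.
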

> +		pm_runtime_get_noresume(&dev->dev);
> +
>  		/*
>  		 * Skip cleanup up the sci_port[0] in early_console_exit(), this
>  		 * port is the same as the earlycon one.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds