Re: UART/TTY console deadlock
From: Sergey Senozhatsky
Date: Tue Jun 30 2020 - 06:55:18 EST
On (20/06/30 12:21), Petr Mladek wrote:
> > So... Do we need to hold uart->port when we disable port->irq? What do we
> > race with? Module removal? The function bumps device PM counter (albeit
> > for UART_CAP_RPM ports only).
>
> Honestly, I do not see where a PM counter gets incremented.
  serial8250_do_startup()
    serial8250_rpm_get()
      pm_runtime_get_sync(p->port.dev)
But this does not happen for all ports, just for UART_CAP_RPM ones.
> Anyway, __disable_irq_nosync() does nothing when
> irq_get_desc_buslock() returns NULL. And irq_get_desc_buslock()
> takes desc->lock when desc exists. This should be enough to
> synchronize any calls.
>
> > But, at the same time, we do a whole bunch
> > of unprotected port->FOO accesses in serial8250_do_startup(). We even set
> > IRQF_SHARED in up->port.irqflags without grabbing the port->lock:
> >
> > up->port.irqflags |= IRQF_SHARED;
> > spin_lock_irqsave(&port->lock, flags);
> > if (up->port.irqflags & IRQF_SHARED)
> > disable_irq_nosync(port->irq);
>
> Yup, this looks suspicious. We set a flag in port.irqflags and take the lock
> only when the flag was set. Either everything needs to be done under
> the lock or the lock is not needed.
>
> Well, I might have missed something. I do not fully understand meaning
> and relation of all the structures.
>
> Anyway, I believe that this is a false positive. If I get it correctly
> serial8250_do_startup() must be called before the serial port could
> be registered as a console. It means that it could not be called
> from inside printk().
From my understanding, I'm afraid we are talking about an actual deadlock
here, not about a false-positive report. Quoting the original email:
: We are trying an S3 suspend stress test and occasionally while
: entering S3 we get a console deadlock.
[..]
> > drivers/tty/serial/8250/8250_port.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> > index d64ca77d9cfa..ad30991e1b3b 100644
> > --- a/drivers/tty/serial/8250/8250_port.c
> > +++ b/drivers/tty/serial/8250/8250_port.c
> > @@ -2275,6 +2275,11 @@ int serial8250_do_startup(struct uart_port *port)
> >
> > if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
> > unsigned char iir1;
> > + bool irq_shared = up->port.irqflags & IRQF_SHARED;
> > +
> > + if (irq_shared)
> > + disable_irq_nosync(port->irq);
> > +
> > /*
> > * Test for UARTs that do not reassert THRE when the
> > * transmitter is idle and the interrupt has already
> > @@ -2284,8 +2289,6 @@ int serial8250_do_startup(struct uart_port *port)
> > * allow register changes to become visible.
> > */
> > spin_lock_irqsave(&port->lock, flags);
> > - if (up->port.irqflags & IRQF_SHARED)
> > - disable_irq_nosync(port->irq);
> >
> > wait_for_xmitr(up, UART_LSR_THRE);
> > serial_port_out_sync(port, UART_IER, UART_IER_THRI);
> > @@ -2297,9 +2300,9 @@ int serial8250_do_startup(struct uart_port *port)
> > iir = serial_port_in(port, UART_IIR);
> > serial_port_out(port, UART_IER, 0);
> >
> > - if (port->irqflags & IRQF_SHARED)
> > - enable_irq(port->irq);
> > spin_unlock_irqrestore(&port->lock, flags);
> > + if (irq_shared)
> > + enable_irq(port->irq);
> >
> > /*
> > * If the interrupt is not reasserted, or we otherwise
>
> I think that it might be safe but I am not 100% sure, sigh.
Yeah, I'm not 100% sure either, but I'd give it a try.
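
To spell out the ordering argument behind the patch, here is a rough
userspace sketch, not kernel code. It only records which lock is taken
while another is held and reports an ABBA inversion. The names
(order_model, model_*, startup_order_inverts) are made up for
illustration. It also assumes that some IRQ-core path can end up in the
console write path, and hence take port->lock, while desc->lock is held;
that part is my reading of the report and is not shown in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for port->lock and the IRQ descriptor's desc->lock. */
enum { PORT_LOCK, DESC_LOCK, NR_LOCKS };

struct order_model {
	bool held[NR_LOCKS];
	/* edge[a][b] is set when lock b was taken while lock a was held */
	bool edge[NR_LOCKS][NR_LOCKS];
};

static void model_lock(struct order_model *m, int id)
{
	for (int i = 0; i < NR_LOCKS; i++)
		if (m->held[i])
			m->edge[i][id] = true;
	m->held[id] = true;
}

static void model_unlock(struct order_model *m, int id)
{
	m->held[id] = false;
}

/* True if some pair of locks was ever taken in both orders. */
static bool model_has_inversion(const struct order_model *m)
{
	for (int a = 0; a < NR_LOCKS; a++)
		for (int b = a + 1; b < NR_LOCKS; b++)
			if (m->edge[a][b] && m->edge[b][a])
				return true;
	return false;
}

/* ASSUMPTION: a printk() issued under desc->lock reaches the console
 * write path, which takes port->lock. */
static void model_irq_path_printk(struct order_model *m)
{
	model_lock(m, DESC_LOCK);
	model_lock(m, PORT_LOCK);
	model_unlock(m, PORT_LOCK);
	model_unlock(m, DESC_LOCK);
}

/* Lock usage of the THRE test in serial8250_do_startup(), reduced to
 * the two locks of interest. */
static void model_startup(struct order_model *m, bool patched)
{
	if (patched) {
		/* disable_irq_nosync() before taking port->lock */
		model_lock(m, DESC_LOCK);
		model_unlock(m, DESC_LOCK);
		model_lock(m, PORT_LOCK);
		model_unlock(m, PORT_LOCK);
		/* enable_irq() after dropping port->lock */
		model_lock(m, DESC_LOCK);
		model_unlock(m, DESC_LOCK);
	} else {
		model_lock(m, PORT_LOCK);
		/* disable_irq_nosync() under port->lock */
		model_lock(m, DESC_LOCK);
		model_unlock(m, DESC_LOCK);
		/* enable_irq() under port->lock */
		model_lock(m, DESC_LOCK);
		model_unlock(m, DESC_LOCK);
		model_unlock(m, PORT_LOCK);
	}
}

/* Runs both paths against one model and reports whether they conflict. */
static bool startup_order_inverts(bool patched)
{
	struct order_model m = {0};

	model_irq_path_printk(&m);
	model_startup(&m, patched);
	return model_has_inversion(&m);
}
```

With the current ordering the model records both port->lock -> desc->lock
and desc->lock -> port->lock; with the patched ordering only the second
edge remains. This says nothing about whether moving the calls out from
under port->lock is safe in other respects.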
-ss