[PATCH 5.17 0737/1126] serial: 8250: Fix race condition in RTS-after-send handling

From: Greg Kroah-Hartman
Date: Tue Apr 05 2022 - 05:53:33 EST


From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>

[ Upstream commit dedab69fd650ea74710b2e626e63fd35584ef773 ]

Set em485->active_timer = NULL isn't always enough to take out the stop
timer. While there is a check that it acts in the right state (i.e.
waiting for RTS-after-send to pass after sending some chars) but the
following might happen:

- CPU1: some chars send, shifter becomes empty, stop tx timer armed
- CPU0: more chars send before RTS-after-send expired
- CPU0: shifter empty irq, port lock taken
- CPU1: tx timer triggers, waits for port lock
- CPU0: em485->active_timer = &em485->stop_tx_timer, hrtimer_start(),
releases lock()
- CPU1: get lock, see em485->active_timer == &em485->stop_tx_timer,
tear down RTS too early

This fix bases on research done by Steffen Trumtrar.

Fixes: b86f86e8e7c5 ("serial: 8250: fix potential deadlock in rs485-mode")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20220215160236.344236-1-u.kleine-koenig@xxxxxxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/tty/serial/8250/8250_port.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 973870ebff69..fd0339d22491 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -1623,6 +1623,18 @@ static inline void start_tx_rs485(struct uart_port *port)
struct uart_8250_port *up = up_to_u8250p(port);
struct uart_8250_em485 *em485 = up->em485;

+ /*
+ * While serial8250_em485_handle_stop_tx() is a noop if
+ * em485->active_timer != &em485->stop_tx_timer, it might happen that
+ * the timer is still armed and triggers only after the current bunch of
+ * chars is send and em485->active_timer == &em485->stop_tx_timer again.
+ * So cancel the timer. There is still a theoretical race condition if
+ * the timer is already running and only comes around to check for
+ * em485->active_timer when &em485->stop_tx_timer is armed again.
+ */
+ if (em485->active_timer == &em485->stop_tx_timer)
+ hrtimer_try_to_cancel(&em485->stop_tx_timer);
+
em485->active_timer = NULL;

if (em485->tx_stopped) {
--
2.34.1