RE: [PATCH] tty: sifive: Finish transmission before changing the clock

From: Yash Shah
Date: Wed Mar 11 2020 - 08:22:11 EST


> -----Original Message-----
> From: linux-riscv <linux-riscv-bounces@xxxxxxxxxxxxxxxxxxx> On Behalf Of
> Palmer Dabbelt
> Sent: 07 March 2020 09:57
> To: Paul Walmsley <paul.walmsley@xxxxxxxxxx>
> Cc: Palmer Dabbelt <palmerdabbelt@xxxxxxxxxx>; Greg KH
> <gregkh@xxxxxxxxxxxxxxxxxxx>; jslaby@xxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Palmer Dabbelt <palmer@xxxxxxxxxxx>; linux-
> serial@xxxxxxxxxxxxxxx; Paul Walmsley <paul.walmsley@xxxxxxxxxx>; linux-
> riscv@xxxxxxxxxxxxxxxxxxx; kernel-team@xxxxxxxxxxx
> Subject: [PATCH] tty: sifive: Finish transmission before changing the clock
>
> From: Palmer Dabbelt <palmerdabbelt@xxxxxxxxxx>
>
> SiFive's UART has a software controller clock divider that produces the final
> baud rate clock. Whenever the clock that drives the UART is changed this
> divider must be updated accordingly, and given that these two events are
> controlled by software they cannot be done atomically.
> During the period between updating the UART's driving clock and internal
> divider the UART will transmit a different baud rate than what the user has
> configured, which will probably result in a corrupted transmission stream.
>
> The SiFive UART has a FIFO, but due to an issue with the programming
> interface there is no way to directly determine when the UART has finished
> transmitting. We're essentially restricted to dead reckoning in order to figure
> that out: we can use the FIFO's TX busy register to figure out when the last
> frame has begun transmission and just delay for a long enough that the last
> frame is guaranteed to get out.
>
> As far as the actual implementation goes: I've modified the existing existing
> clock notifier function to drain both the FIFO and the shift register in on
> PRE_RATE_CHANGE. As far as I know there is no hardware flow control in
> this UART, so there's no good way to ask the other end to stop transmission
> while we can't receive (inserting software flow control messages seems like a
> bad idea here).
>
> Signed-off-by: Palmer Dabbelt <palmerdabbelt@xxxxxxxxxx>
> ---
> I have not tested this, as I don't have any hardware. I'm also not even
> remotely familiar with the serial subsystem, so I don't know if there's a
> better way of going about this. I'm specifically worried about a udelay() that
> could be quite long. Maybe some sort of "delay for short times, sleep for
> long times" approach would be better?
>
> I don't know if this manifests in practice on existing hardware when running
> real workloads, but I'd be willing to bet that it would be possible to trigger
> the bug on purpose as by my calculations there's about a 10k cycle window in
> which the clock can't change. IIRC there's a lot of instability when changing
> the clock frequency on the HiFive Unleashed so I doubt people are going to
> stumble across the issue regularly in practice.
>
> drivers/tty/serial/sifive.c | 28 ++++++++++++++++++++++++----
> 1 file changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/tty/serial/sifive.c b/drivers/tty/serial/sifive.c index
> d5f81b98e4d7..d34031e842d0 100644
> --- a/drivers/tty/serial/sifive.c
> +++ b/drivers/tty/serial/sifive.c
> @@ -618,10 +618,10 @@ static void sifive_serial_shutdown(struct uart_port
> *port)
> *
> * On the V0 SoC, the UART IP block is derived from the CPU clock source
> * after a synchronous divide-by-two divider, so any CPU clock rate change
> - * requires the UART baud rate to be updated. This presumably could
> corrupt any
> - * serial word currently being transmitted or received. It would probably
> - * be better to stop receives and transmits, then complete the baud rate
> - * change, then re-enable them.
> + * requires the UART baud rate to be updated. This presumably corrupts
> + any
> + * serial word currently being transmitted or received. In order to
> + avoid
> + * corrupting the output data stream, we drain the transmit queue
> + before
> + * allowing the clock's rate to be changed.
> */
> static int sifive_serial_clk_notifier(struct notifier_block *nb,
> unsigned long event, void *data) @@ -
> 629,6 +629,26 @@ static int sifive_serial_clk_notifier(struct notifier_block
> *nb,
> struct clk_notifier_data *cnd = data;
> struct sifive_serial_port *ssp = notifier_to_sifive_serial_port(nb);
>
> + if (event == PRE_RATE_CHANGE) {
> + /*
> + * The TX watermark is always set to 1 by this driver, which
> + * means that the TX busy bit will lower when there are 0
> bytes
> + * left in the TX queue -- in other words, when the TX FIFO is
> + * empty.
> + */
> + __ssp_wait_for_xmitr(ssp);
> + /*
> + * On the cycle the TX FIFO goes empty there is still a full
> + * UART frame left to be transmitted in the shift register.
> + * The UART provides no way for software to directly
> determine
> + * when that last frame has been transmitted, so we just
> sleep
> + * here instead. As we're not tracking the number of stop
> bits
> + * they're just worst cased here. The rest of the serial
> + * framing parameters aren't configurable by software.
> + */
> + udelay(DIV_ROUND_UP(12 * 1000 * 1000, ssp->baud_rate));
> + }
> +
> if (event == POST_RATE_CHANGE && ssp->clkin_rate != cnd-
> >new_rate) {
> ssp->clkin_rate = cnd->new_rate;
> __ssp_update_div(ssp);
> --
> 2.25.1.481.gfbce0eb801-goog
>

A quick test on HiFive Unleashed board showed some improvements.
Prior to this patch, I have been observing some random corrupted characters on serial console when continuously changing the CPU clock rate.
After applying this patch I don't see those corrupted characters anymore while changing the clock rate.

Tested-by: Yash Shah <yash.shah@xxxxxxxxxx>

This observation is based on a quick initial test on HiFive Unleashed. I am planning to further test it by inducing the error on purpose. Will try to update the result soon.

- Yash