Re: [External] Re: [PATCH] serial: 8250: fix panic due to PSLVERR

From: John Ogness
Date: Thu Apr 03 2025 - 09:46:13 EST


On 2025-04-03, yunhui cui <cuiyunhui@xxxxxxxxxxxxx> wrote:
>>> When the PSLVERR_RESP_EN parameter is set to 1, the device generates
>>> an error response if an attempt is made to read an empty RBR (Receive
>>> Buffer Register) while the FIFO is enabled.
>>>
>>> In serial8250_do_startup, calling serial_port_out(port, UART_LCR,
>>> UART_LCR_WLEN8) triggers dw8250_check_lcr(), which invokes
>>> dw8250_force_idle() and serial8250_clear_and_reinit_fifos(). The latter
>>> function enables the FIFO via serial_out(p, UART_FCR, p->fcr).
>>> Execution proceeds to the dont_test_tx_en label:
>>> ...
>>> serial_port_in(port, UART_RX);
>>> This satisfies the PSLVERR trigger condition.
>>>
>>> Because another CPU(e.g., using printk) is accessing the UART (UART
>>> is busy), the current CPU fails the check (value & ~UART_LCR_SPAR) ==
>>> (lcr & ~UART_LCR_SPAR), causing it to enter dw8250_force_idle().
>>
>> Didn't this[0] patch resolve this exact issue?
>>
>> John Ogness
>>
>> [0] https://lore.kernel.org/lkml/20220713131722.2316829-1-vamshigajjela@xxxxxxxxxx
>
> No, these are two separate issues. This[0] patch is necessary, as
> expressed in this comment:
>
> /*
> * With PSLVERR_RESP_EN parameter set to 1, the device generates an
> * error response when an attempt to read an empty RBR with FIFO
> * enabled.
> */
>
> The current patch addresses the following scenario:
>
> cpuA is accessing the UART via printk(), causing the UART to be busy.
> cpuB follows the CallTrace path:
> -serial8250_do_startup()
> --serial_port_out(port, UART_LCR, UART_LCR_WLEN8);
> ---dw8250_serial_out32
> ----dw8250_check_lcr
> -----dw8250_force_idle (triggered by UART busy)
> ------serial8250_clear_and_reinit_fifos
> -------serial_out(p, UART_FCR, p->fcr); (enables FIFO here)
> cpuB proceeds to the dont_test_tx_en label:
> ...
> serial_port_in(port, UART_RX); //FIFO is enabled, and the UART has
> no data to read, causing the device to generate a PSLVERR error and
> panic.
>
> Our solution:
> Relevant serial_port_out operations should be placed in a critical section.
> Before reading UART_RX, check if data is available (e.g., by verifying
> the UART_LSR DR bit is set).

OK, now I see. The problem is the explicit reads of UART_RX near the end
of serial8250_do_startup().

>> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
>> index 3f256e96c722..6909c81109db 100644
>> --- a/drivers/tty/serial/8250/8250_port.c
>> +++ b/drivers/tty/serial/8250/8250_port.c
>> @@ -2264,13 +2264,16 @@ int serial8250_do_startup(struct uart_port *port)
>> * Clear the FIFO buffers and disable them.
>> * (they will be reenabled in set_termios())
>> */
>> + uart_port_lock_irqsave(port, &flags);
>> serial8250_clear_fifos(up);
>> + uart_port_unlock_irqrestore(port, flags);

Can you clarify why serial8250_clear_fifos() needs to be in a critical
section?

serial8250_do_shutdown() and do_set_rxtrig() also call
serial8250_clear_fifos() without holding the port lock.

>>> /*
>>> * Clear the interrupt registers.
>>> */
>>> - serial_port_in(port, UART_LSR);
>>> - serial_port_in(port, UART_RX);
>>> + lsr = serial_port_in(port, UART_LSR);
>>> + if (lsr & UART_LSR_DR)
>>> + serial_port_in(port, UART_RX);

Do we care about the unchecked UART_RX in serial8250_do_shutdown()?

/*
* Read data port to reset things, and then unlink from
* the IRQ chain.
*/
serial_port_in(port, UART_RX);
serial8250_rpm_put(up);

up->ops->release_irq(up);
}

Otherwise all other UART_RX reads are either checking UART_LSR_DR first
or are under the port lock.

John