RE: [PATCH 2/2] serial: sc16is7xx: set TX FIFO trigger level to half FIFO to prevent underruns

From: Maarten Brock

Date: Thu Jun 25 2026 - 09:02:28 EST


> From: David Laight <david.laight.linux@xxxxxxxxx>
> Sent: Tuesday 23 June 2026 20:42
>
> On Tue, 23 Jun 2026 19:13:28 +0200
> Paul Mbewe <paultyson.mbewe@xxxxxxxxxxxxxx> wrote:
>
> > Hi David,
> >
> > On the test system, the relevant threads already run with RT priority:
> >
> >   irq/134-spi2.0     SCHED_FIFO priority 50
> >   sc16is7xx          SCHED_FIFO priority 50
> >
> > So this is not caused by the IRQ thread running as a normal SCHED_OTHER
> > task. That does not remove all latency sources, of course; on this
> > single-core SPI system the threaded IRQ can still be affected by other
> > RT/kernel activity, IRQ/preemption-disabled sections, SPI transfer time,
> > or lock contention.
> >
> > I agree, changing the TX trigger from 8 to 32 free spaces reduces the
> > per-interrupt time-to-empty margin. With an 8-space trigger the FIFO
> > still contains 56 bytes when THRI asserts, while with a 32-space trigger
> > it contains 32 bytes. So the v1 commit message is misleading when it
> > describes this as increasing the refill window.

I would say plain wrong and as such cannot be accepted.

> > The observed effect is instead that the 8-space trigger causes many
> > small TX refill events. Each event has roughly the same cost, as you
> > said. If the handler runs in time, it can catch up by seeing more than
> > 8 free spaces and writing more data. The failure happens when one event
> > is delayed long enough for the FIFO to drain.

True.

> > Using a 32-space trigger reduces the number of refill events and the
> > associated IRQ/SPI load. It also reduces the chance that one delayed
> > event lets the FIFO drain completely.

Not true. The necessary delay to trigger a drained FIFO is reduced. So the
chance that such a delay occurs is increased.

> > On the tested setup this reduced
> > irq/134-spi2.0 CPU usage from about 15-17% to about 5%, sys CPU from
> > about 51-61% to about 19-28%, and load average from about 2.0-2.2 to
> > about 0.65-1.3. With that change, the observed TX gaps disappeared.
>
> Even with those figures (and the cpu % differences seem reasonable)
> I still don't see why it stops the underruns.

It probably stops the underruns because the lower CPU load from the IRQ
gives the CPU more time to handle the overhead and other high priority
tasks. It seems like the scheduling overhead is bigger than the actual
interrupt handling.

> >
> > So I agree the commit message should be reworked to describe this as
> > reducing TX refill events and IRQ/SPI load, not as increasing the
> > per-interrupt latency margin.
> >
> > If changing the default trigger globally is considered too broad, I can
> > also look at making the TX trigger configurable or limiting the change to
> > SPI-backed devices.

I consider this to be the right approach. The user is the one that knows
about the platform and its properties (#cores, cpu freq, scheduling latency,
spi freq, uart baudrate).

Still, I think it is good to change the default to halfway the fifo depth.
It reduces the interrupt frequency by a factor 4 (8/32) at the cost of a
less than halved allowed latency (32/56).

And this is probably equally useful for the RX FIFO when data comes in at
full speed. With a halved allowed latency before overruns happen the
interrupt frequency can also be divided by 4 resulting in an similar lower
cpu load.

> >
> > Thanks
> > Paul

Kind regards,
Maarten