Re: Serial data loss

From: gianluca
Date: Tue Apr 07 2020 - 05:01:25 EST


Hello,
I am very pleased the Mr. Greg Kroah-Hartman is writing to me in person!

I appreciate a lot sir!

On 04/07/2020 10:24 AM, Greg Kroah-Hartman wrote:
On Tue, Apr 07, 2020 at 09:30:21AM +0200, gianluca wrote:
I have a BIG trouble having dataloss when using two internal serial ports of
my boards based on NXP/FreeScale iMX28 SoC ARMv5Te ARM920ej-s architecture.

It runs at 454Mhz.

Kernel used 4.9.x

That's a very old kernel, you are going to have to get support for that
from the vendor you bought it from :(


We are the vendor. ;-)

Jokes apart, I can try to use the latest kernel 5.6, and see how is going on them, but at the first check the driver seems exactly the same as in kernel 4.9.

When using my test case unit software between two serial ports connect each
other by a null modem cable, it fails when the speed rate are different,

Of course, how would that work?


I am not native english speaker so I am misleading to a misunderstanding: my test case is a software with two pthreads which the main thread is working with a differnet baud rate than the other pthread. Using the same software in two different machines, and using the same baudrate for each corrispondant port it should work.

i.e. /dev/ttyAPP1 is running at 9600 and /dev/ttyAPP2 is running at 38400

The same in the other machine. Both ports are null-modem connected:

9600 /dev/ttyAPP1 <----> /dev/ttyAPP1 9600
38400 /dev/ttyAPP2 <----> /dev/ttyAPP2 38400

I hope to be clear now. ;-)

and
dataloss is increasing higher the speed rate.

What type of flow control are you using?


Unfortunately no flow control. Because the I cannot use it. When connected to the real-hardware those two ports are connected to a microcontroller unit which does not have flow control, only RX & TX connected (i.e. no RTS/CTS/DTE/DCE lines)

I suppose to have overruns (now I am modifying my software to check them
too), but I think it is due the way the ISR is called and all data are
passed to the uart circular buffer within the interrupt routine.

Are you using flow control?


As above, no [ unfortunately ]


I am talking about the high latency from the IRQ up to the service routine
when flushing the FIFO and another IRQ is called by another uart in the same
time at different speed.

The code I was looking is: drivers/tty/serial/mxs-auart.c __but__ all other
serial drivers are acting in the same way: they are reading one character at
time from the FIFO (if it exists) and put it into the circular buffer so
serial/tty driver can pass them to the user read routine.

Each function call has some overhead and it is time-consuming, and if
another interrupt is invoked by the same UART Core but from another serial
port (different context) the continuos insertion done by hardware UART into
the FIFO cannot be served fast enough to have an overrun. I think this can
be applied __almost__ to every serial driver as they are written in the same
way.

And it is __NOT__ an issue because of the CPU and its speed! Using two
serial converter (FTDI and Prolific PL2303 based) on each board, the problem
does not appear at all even after 24 hours running at more than 115200!!!

usb-serial devices are totally different and send data to the host in a
completly different way.

Your hardware might just not be able to handle really high baud rates at
a continous stream, what baud rate were you using?


I suppose that, but the same issue can be proven with all single core (NO FIFO UART) processors using two ports on the same uart core, running Linux kernel @ 450 Mhz or less.

The irq latency it is the same.

And again, this is what flow control was designed for, please use it.


I know and usually I am using a sort of protocol which can check correctness of packet, and if not, the packet has to be reasked/resent.
In this case the microcontroller board I am connected to is not built by us, and the software is a custom protocol (and I do not know if an error on transfer can be accomplished by another request).

So the flow control __CANNOT_BE_USED_AT_ALL__...

It does work fine if I am using two different serial devices: one internal
uart (mxs-auart) and an external uart (ttyUSB).

Again, different interrupt and protocols being used for the USB stuff.


...and in our case is working better than the internal uart driver on the same board. It is a real pity...

thanks,


Thanks to you, mr. greg k-h!

greg k-h


P.S.: I am a very close friend of Andrea Arcangeli, we grew up in the same place, and we went in the same school here in Italy (Imola - bologna).

We used to talked about you last Christmas Holidays when Andrea came to Italy from NY

Regards,
Gianluca Renzi
--
Eurek s.r.l. |
Electronic Engineering | http://www.eurek.it
via Celletta 8/B, 40026 Imola, Italy | Phone: +39-(0)542-609120
p.iva 00690621206 - c.f. 04020030377 | Fax: +39-(0)542-609212