Re: Bug in Linux serial driver (and fix proposal)

Wojciech Zabolotny (wzab@ise.pw.edu.pl)
Tue, 5 Oct 1999 10:04:09 +0200


On Mon, Oct 04, 1999 at 03:46:00PM -0400, Theodore Y. Ts'o wrote:
> Date: Mon, 4 Oct 1999 20:56:10 +0200
> From: Wojciech Zabolotny <wzab@ise.pw.edu.pl>
>
> I just discovered, that the user program can not detect the "overrun"
> event in serial data (at least in 2.0.x kernels, I don't know the 2.2.x).
> This is a serious bug in the serial driver, if it is used for applications
> sensitive to data loss (eg. processing of data from AD converter connected
> to the serial port).
>
> It's true that it would be nice if applications where informed about the
> data loss, but most of the time there's not much they can do after the
> fact, after data has been lost.

In my application (it is an "intelligent" A/D converter sending the data to
the PC through RS at 115200) the detection of overrun is really essential.
The user should be at least warned about such event, and if possible, the
baudrate should be lowered (it can be done automatically by the application,
and the break in the data stream may be marked in the recorded data).

The application shall be used in the hospitals, which due to permanent lack
of money use very old hardware - so RS are offen 16450 based (no input
FIFO), IDE controllers use only PIO mode and so on.
I'm pretty sure, that there are more people who experience that problem.

I think that the first proposed solution (marking of overrun in the data
stream, when PARMRK flag is set) does not increase seriously the size of
driver's code, and should be safe for other applications (the sequence
0xff 0x01 has no meaning when PARMRK flag is used, so can be used for
marking of overrun event).

> That's why currently the overrun case is treated as a system log; it's
> an indication that system isn't setup correctly, or that there's a
> kernel problem. In practice overruns should never happen. The most
> common case of overruns is due to the IDE device driver taking too
> long. Read the man page about hdparm's -u option, and try it. It
> will likely cause your overrun problem to go away altogether.
>
> I'll think about adding your suggestions, but our time is almost
> certain better spent trying to fix why the overrun's are happening in
> the first place, rather than trying to improve ways that the application
> is notified of the fact that something had gone wrong. Let's rather
> concentrate on avoiding things from going wrong in the first place.

I agree, but Linux is often used on very old hardware (unusable with other
contemporary OS's) and I have at least one computer, where there was no way
to avoid overruns at 115200 (386/40MHz, 8MB RAM, IDE-PIO). The possibility
of using such old hardware is one of big advatages of Linux (that's why
most distributions still contain mainly binaries compiled for i386).
The information about the overrun in the logfile may be easily overlooked...

-- 
			Thanks for all the answers
	                Wojciech M. Zabolotny
			http://www.ise.pw.edu.pl/~wzab 
			wzab@ise.pw.edu.pl

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/