Re: VIOLENT 2.1.129 serial bug on Alpha.

Maciej W. Rozycki (macro@ds2.pg.gda.pl)
Tue, 24 Nov 1998 20:31:00 +0100 (MET)


On Mon, 23 Nov 1998, Theodore Y. Ts'o wrote:

> I think this patch should solve the problem. It sounds similar to a
> problem reported to me by Maciej W. Rozycki last week. If so, it's a
> long-standing serial driver problem which occurs only if you are using
> it as a module.
>
> There are a few cases where the module use count gets decremented more
> than it should. Why this causes an oops afterwards I'm not entirely
> sure, but I looked over the code paths which Maciej pointed out to me,
> and sure enough, they do over-decrement the module use count.
[...]
> RCS file: drivers/char/RCS/serial.c,v
> retrieving revision 1.1
> diff -u -r1.1 drivers/char/serial.c
> --- drivers/char/serial.c 1998/11/20 17:27:15 1.1
> +++ drivers/char/serial.c 1998/11/20 17:29:03
> @@ -2581,10 +2581,8 @@
> }
> tty->driver_data = info;
> info->tty = tty;
> - if (serial_paranoia_check(info, tty->device, "rs_open")) {
> - MOD_DEC_USE_COUNT;
> + if (serial_paranoia_check(info, tty->device, "rs_open"))
> return -ENODEV;
> - }
>
> #ifdef SERIAL_DEBUG_OPEN
> printk("rs_open %s%d, count = %d\n", tty->driver.name, info->line,
> @@ -2631,7 +2629,6 @@
>
> retval = block_til_ready(tty, filp, info);
> if (retval) {
> - MOD_DEC_USE_COUNT;
> #ifdef SERIAL_DEBUG_OPEN
> printk("rs_open returning after block_til_ready with %d\n",
> retval);

Shouldn't all seven references to MOD_DEC_USE_COUNT within rs_open be
removed? As far as I can see from tty_open (in tty_io.c), rs_close is
always called after an unsuccessful rs_open. Since rs_close always calls
MOD_DEC_USE_COUNT, doing the same in rs_open is not only unnecessary but
harmful: the count gets decremented twice for a single failed open.

The relevant part of tty_io.c starts at line 1314:

	if (tty->driver.open)
		retval = tty->driver.open(tty, filp);

then:

	if (retval) {
		/* the printk removed */
		release_dev(filp);

and release_dev calls:

	if (tty->driver.close)
		tty->driver.close(tty, filp);

So I believe that if you remove only two of the MOD_DEC_USE_COUNT
references, the other failure paths in rs_open will still over-decrement
the count, even if they are unlikely to be hit.
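
To make the argument concrete, here is a toy user-space model of the call
path described above. It is only a sketch, not kernel code: use_count and
all the model_* names are hypothetical stand-ins, and it assumes that
rs_open takes its reference before it can fail, which the message does not
state. The dec_in_open flag selects whether the failure path in the open
routine drops the reference itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the module use count. */
static int use_count;

/* Toy rs_close: always drops one reference. */
static void model_rs_close(void)
{
	use_count--;
}

/*
 * Toy rs_open: takes one reference (assumption), then may fail.
 * With dec_in_open set, the failure path drops the reference itself.
 */
static int model_rs_open(bool fail, bool dec_in_open)
{
	use_count++;
	if (fail) {
		if (dec_in_open)
			use_count--;
		return -1;
	}
	return 0;
}

/* Toy tty_open: a failed driver open is followed by the driver close. */
static int model_tty_open(bool fail, bool dec_in_open)
{
	int retval = model_rs_open(fail, dec_in_open);

	if (retval)
		model_rs_close();	/* release_dev() -> driver.close() */
	return retval;
}

/* Net change of the use count caused by one failed open. */
int net_change_after_failed_open(bool dec_in_open)
{
	int retval;

	use_count = 0;
	retval = model_tty_open(true, dec_in_open);
	assert(retval != 0);	/* this scenario is the failing open */
	return use_count;
}
```

In this model net_change_after_failed_open(true) ends one below where it
started, while net_change_after_failed_open(false) ends balanced. That is
why every failure path in rs_open matters, not just two of them.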

Am I right?

I'll provide a patch if so.

BTW, the cause of the Oops is obvious -- I have two processes using the
serial module, say gpm and getty. Gpm runs OK, but getty is blocked in
open, waiting for the modem lines to become active. Thus the usage count
is two. Now if getty gets killed, its failed open decrements the count
twice (once in rs_open and once in rs_close), so the usage count drops to
zero and the next run of rmmod -a removes the module. The next time gpm
tries to talk to the serial port (or a mouse interrupt arrives), tty_io
references a serial module that no longer exists...
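
The gpm/getty scenario can be replayed with a few counters. Again this is
a hedged sketch rather than driver code: struct serial_model and every
function name here are invented for illustration, and the only rule taken
from the text is that the module becomes removable once its use count
reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

struct serial_model {
	int use_count;	/* module use count */
	int open_users;	/* processes that really hold the port open */
};

/* A successful open, e.g. gpm. */
static void open_ok(struct serial_model *m)
{
	m->use_count++;
	m->open_users++;
}

/* An open that blocks waiting for modem lines, e.g. getty. */
static void open_blocked(struct serial_model *m)
{
	m->use_count++;
}

/* The blocked opener is killed: open fails, then close runs. */
static void blocked_open_killed(struct serial_model *m, bool dec_in_open)
{
	if (dec_in_open) {
		assert(m->use_count > 0);
		m->use_count--;		/* failure path in the open routine */
	}
	assert(m->use_count > 0);
	m->use_count--;			/* close called after the failed open */
}

/* True if the module could be removed while a process still uses it. */
bool module_removed_under_user(bool dec_in_open)
{
	struct serial_model m = { 0, 0 };

	open_ok(&m);				/* gpm:   count 1 */
	open_blocked(&m);			/* getty: count 2 */
	blocked_open_killed(&m, dec_in_open);
	return m.use_count == 0 && m.open_users > 0;
}
```

With dec_in_open set the function returns true: the count has reached zero
although gpm still has the port open. With it cleared the count stays at
one and the module is kept.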

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/