Re: WARNING at: drivers/char/tty_ldisc.c

From: Linus Torvalds
Date: Sun Aug 02 2009 - 20:40:56 EST




On Sun, 2 Aug 2009, Alan Cox wrote:
> >
> > So exactly what _does_ happen if we get rid of that hack?
>
> Serial console breaks if I remember rightly because the hangup takes out
> the port and printk can't then use the resources that were attached to
> it.

Hmm. That sounds like a likely explanation for the hack, but we do
re-initialize the tty sufficiently that I don't see why the serial console
would be unable to use it - it's not like we go through the "filp" anyway.

So it must be something about /dev/console itself, not so much the serial
lines and serial consoles.

I wonder if the hack is strictly necessary, or could perhaps at least be
moved down a bit. The comment around that area seems to imply there were
other issues ("will cause tty->count and state->count to go out of sync"),
and that whole tty->count thing is some seriously old code, and the tty
layer has changed a lot since 1992.

So I do get the feeling that it may be just old code that simply nobody
has dared remove - it may have made more sense back then than it does
now, and has just been carried around.

If it actually were to cause problems with the tty->count thing, I think
Sergey should have seen it thanks to us still doing that old
CHECK_TTY_COUNT thing.

> redirected_tty_write should I think count for the redirection to
> hung_up_tty_fops, but I'm not sure you want to do the hangup instead of
> close.

I'll try with a serial console on one of my machines to see if I can see
anything wrong. It _would_ be nice to get rid of that thing, since it
clearly causes problems.

> You could just finish the ldisc refcounting. The last set of patches you
> had off me split tty->ldisc from struct tty ready to do exactly that and
> I don't think there is anything left that stops it happening now (It was
> just not ready in time)

I considered it, and it didn't look horrible (the thing really is pretty
self-contained in tty_ldisc_try() and tty_ldisc_deref()). But the counts
are off-by-one (ie zero means "one user - the tty itself"), and it really
isn't the kind of thing I'd like to do surgery on after -rc5 (or even
after -rc1, for that matter).

But yeah, it _should_ be possible to just get rid of tty_ldisc_wait_idle()
entirely if we were to just make tty_ldisc_deref() free the ldisc, and
started the ldisc->refcount from 1. Then "tty_ldisc_put()" should be just
a regular tty_ldisc_deref() (it would _normally_ be the one that makes it
go to zero and frees the 'struct ldisc', unless there are outstanding
references).
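
To make the counting concrete, here is a minimal userspace sketch of the
scheme described above. It is not kernel code: struct demo_ldisc and the
demo_ldisc_*() helpers are hypothetical names invented for illustration,
the "freed" pointer is only a hook so the release can be observed, and
there is no locking or atomic arithmetic, which a real implementation
would need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's line-discipline object. */
struct demo_ldisc {
	int refcount;	/* 1 == only the owning tty holds a reference */
	int *freed;	/* illustration hook: set to 1 when released */
};

/* Allocate with the count starting at 1: the tty's own reference. */
static struct demo_ldisc *demo_ldisc_alloc(int *freed)
{
	struct demo_ldisc *ld = malloc(sizeof(*ld));

	if (!ld)
		return NULL;
	ld->refcount = 1;
	ld->freed = freed;
	return ld;
}

/* Take an extra reference, as a tty_ldisc_try()-style helper would. */
static struct demo_ldisc *demo_ldisc_ref(struct demo_ldisc *ld)
{
	assert(ld->refcount > 0);
	ld->refcount++;
	return ld;
}

/* Drop a reference; whoever drops the last one frees the object. */
static void demo_ldisc_deref(struct demo_ldisc *ld)
{
	assert(ld->refcount > 0);
	if (--ld->refcount == 0) {
		if (ld->freed)
			*ld->freed = 1;
		free(ld);
	}
}

/* The tty giving up its own reference is just an ordinary deref. */
static void demo_ldisc_put(struct demo_ldisc *ld)
{
	demo_ldisc_deref(ld);
}
```

With the count starting at 1, nobody has to wait for the ldisc to go
idle: if a user still holds a reference when the tty does its put, the
object simply stays alive until that user's deref drops the count to
zero.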

So it doesn't look bad. It's just that if it turns out that console
hackery isn't necessary, I'd much rather get rid of that.

And right now it looks like the console hackery not only isn't necessary,
but is actively the thing that breaks - the "odd" thing about the
single-user shutdown is exactly the fact that it opens /dev/console
instead of a regular tty.

Linus