Re: [PATCH] tty: Fix crash with flush_to_ldisc()

From: Michael Neuling
Date: Fri Apr 07 2017 - 00:58:29 EST


Al,

On Fri, 2017-04-07 at 05:12 +0100, Al Viro wrote:
> On Fri, Apr 07, 2017 at 01:50:53PM +1000, Michael Neuling wrote:
>
> > diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> > index bdf0e6e899..a2a9832a42 100644
> > --- a/drivers/tty/n_tty.c
> > +++ b/drivers/tty/n_tty.c
> > @@ -1668,11 +1668,17 @@ static int
> >  n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
> >    char *fp, int count, int flow)
> >  {
> > - struct n_tty_data *ldata = tty->disc_data;
> > + struct n_tty_data *ldata;
> >   int room, n, rcvd = 0, overflow;
> >
> >   down_read(&tty->termios_rwsem);
> >
> > + ldata = tty->disc_data;
> > + if (!ldata) {
> > + up_read(&tty->termios_rwsem);
>
> I very much doubt that it's correct.  It shouldn't have been called after
> the n_tty_close(); apparently it has been.  ->termios_rwsem won't serialize
> against it, and something apparently has gone wrong with the exclusion there.
> At the very least I would like to see what's to prevent n_tty_close() from
> overlapping the execution of this function - if *that* is what broke, your
> patch will only paper over the problem.

It does seem like I'm papering over a problem. Would you be happy with the patch
if we add a WARN_ON_ONCE()?
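
To make the WARN_ON_ONCE() idea concrete, here is a rough userspace model of
the guard, not kernel code. Everything prefixed model_ is a hypothetical
stand-in: the reader counter replaces ->termios_rwsem, model_warn_on_once()
imitates WARN_ON_ONCE() (reports the first hit only, always returns the
condition), and returning 0 on the early-exit path is an assumption, since the
quoted hunk is cut off before the return.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tty_struct: only the fields the guard uses. */
struct model_tty {
	int readers;      /* models read-side holders of ->termios_rwsem */
	void *disc_data;  /* NULL once the line discipline data is gone */
};

static int model_warn_hits; /* how many times a warning was actually printed */

/* Imitates WARN_ON_ONCE(): prints on the first true condition only,
 * but returns the condition on every call. */
static int model_warn_on_once(int cond, const char *what)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		model_warn_hits++;
		fprintf(stderr, "WARNING (once): %s\n", what);
	}
	return cond;
}

static void model_down_read(struct model_tty *tty) { tty->readers++; }
static void model_up_read(struct model_tty *tty)   { tty->readers--; }

/* Models the guarded entry of n_tty_receive_buf_common(): read disc_data
 * under the lock, warn once and bail out (assumed return value 0) if NULL. */
static int model_receive_buf(struct model_tty *tty, int count)
{
	void *ldata;

	model_down_read(tty);

	ldata = tty->disc_data;
	if (model_warn_on_once(ldata == NULL, "disc_data is NULL")) {
		model_up_read(tty); /* the early exit must drop the lock too */
		return 0;
	}

	/* The real function would consume up to count bytes here. */
	model_up_read(tty);
	return count;
}
```

The warning would not fix the underlying exclusion problem; it would only make
the bad call visible once instead of silently dropping the data.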

I think disc_data being NULL is a permanent condition rather than a transient
race: if we read it again later, it's still NULL.

Benh and I looked at this a bunch and we did notice tty_ldisc_reinit() was being
called without the tty lock in one location. We tried the below patch
but it didn't help (not an upstreamable patch, just a test).

There have been a few attempts at fixing this, but none have worked for
me:
https://lkml.org/lkml/2017/3/23/569
and
https://patchwork.kernel.org/patch/9114561/

I'm not that familiar with the tty layer (and I value my sanity) so I'm
struggling to root cause it by myself.

Mikey


diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 734a635e73..121402ff25 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1454,6 +1454,9 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct *
driver->ttys[tty->index] = NULL;
}

+extern int tty_ldisc_lock(struct tty_struct *tty, unsigned long timeout);
+extern void tty_ldisc_unlock(struct tty_struct *tty);
+
/*
* tty_reopen() - fast re-open of an open tty
* @tty - the tty to open
@@ -1466,6 +1469,7 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct *
static int tty_reopen(struct tty_struct *tty)
{
struct tty_driver *driver = tty->driver;
+ int rc = 0;

if (driver->type == TTY_DRIVER_TYPE_PTY &&
driver->subtype == PTY_TYPE_MASTER)
@@ -1479,10 +1483,12 @@ static int tty_reopen(struct tty_struct *tty)

tty->count++;

+ tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
if (!tty->ldisc)
- return tty_ldisc_reinit(tty, tty->termios.c_line);
+ rc = tty_ldisc_reinit(tty, tty->termios.c_line);
+ tty_ldisc_unlock(tty);

- return 0;
+ return rc;
}

/**
diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index d0e84b6226..3b13ff11c5 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -334,7 +334,7 @@ static inline void __tty_ldisc_unlock(struct tty_struct *tty)
ldsem_up_write(&tty->ldisc_sem);
}

-static int tty_ldisc_lock(struct tty_struct *tty, unsigned long timeout)
+int tty_ldisc_lock(struct tty_struct *tty, unsigned long timeout)
{
int ret;

@@ -345,7 +345,7 @@ static int tty_ldisc_lock(struct tty_struct *tty, unsigned long timeout)
return 0;
}

-static void tty_ldisc_unlock(struct tty_struct *tty)
+void tty_ldisc_unlock(struct tty_struct *tty)
{
clear_bit(TTY_LDISC_HALTED, &tty->flags);
__tty_ldisc_unlock(tty);