Re: [PATCH] vt: Fix sleeping functions called from atomic context

From: Fabio M. De Francesco
Date: Thu Nov 18 2021 - 12:01:31 EST


On Thursday, November 18, 2021 1:14:59 PM CET Tetsuo Handa wrote:
> On 2021/11/18 18:38, Fabio M. De Francesco wrote:

> If ->flow.lock were held from only schedulable context, replacing this spinlock with
> a mutex would be possible. But stop_tty() says "may be called from any context" which
> means that we can't use a mutex...
>
> Making do_con_write() no-op when called with IRQs disabled would be the minimal change
> that can silence the syzbot. But this does not fix the regression for drivers/tty/n_hdlc.c
> introduced by f9e053dcfc02b0ad.
>
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -2902,7 +2902,7 @@ static int do_con_write(struct tty_struct *tty, const unsigned char *buf, int co
>  	struct vt_notifier_param param;
>  	bool rescan;
>
> -	if (in_interrupt())
> +	if (in_interrupt() || irqs_disabled())
>  		return count;
>
>  	console_lock();
> @@ -3358,7 +3358,7 @@ static void con_flush_chars(struct tty_struct *tty)
>  {
>  	struct vc_data *vc;
>
> -	if (in_interrupt())	/* from flush_to_ldisc */
> +	if (in_interrupt() || irqs_disabled())	/* from flush_to_ldisc */
>  		return;
>
>  	/* if we race with con_close(), vt may be null */
>

For what my opinion is worth, I like Tetsuo's solution quoted above.
The bug is real and I think it must be addressed. This seems the most
straightforward and effective way to silence the "sleeping function called
from atomic context" report.
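
To spell out what the extra check buys us, here is a small userspace model of
the guard. It is only a sketch: model_in_interrupt, model_irqs_disabled and
model_do_con_write() are hypothetical stand-ins that I made up for
illustration, not kernel code. The one thing it takes from the diff is the
behaviour of the early return: when the caller cannot sleep, the function
reports the whole buffer as consumed and never reaches console_lock().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's in_interrupt() and irqs_disabled(). */
static bool model_in_interrupt;
static bool model_irqs_disabled;

/*
 * Model of the patched guard in do_con_write().
 * *took_console_lock records whether we would have reached console_lock(),
 * which may sleep and therefore must not be called from atomic context.
 */
static int model_do_con_write(int count, bool *took_console_lock)
{
	*took_console_lock = false;

	/* Atomic context: claim the bytes were written, but drop the output. */
	if (model_in_interrupt || model_irqs_disabled)
		return count;

	/* Schedulable context: safe to take the (sleeping) console lock. */
	*took_console_lock = true;
	return count;
}
```

The return value is the same in both cases, so the caller cannot tell that
the output was dropped. That is the trade-off Tetsuo asks about further down.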

Regards,

Fabio M. De Francesco

> According to scripts/get_maintainer.pl, Greg and Jiri are maintainers for the n_hdlc driver.
> Greg and Jiri, what do you think? Is sacrificing the ability to write to consoles when
> n_hdlc_send_frames() is called from the __start_tty() path considered tolerable? (Maybe
> OK for now and for stable kernels, since the fact that nobody has reported this problem
> suggests that nobody depends on this ability.)
>
> But if we must fix the regression for drivers/tty/n_hdlc.c , we need to use something
> like https://lkml.kernel.org/r/7d851c88-f657-dfd8-34ab-4891ac6388dc@xxxxxxxxxxxxxxxxxxx
> in order to achieve what f9e053dcfc02b0ad meant. That is, extend tty->stopped in order
> to be able to represent "started state (currently indicated by tty->stopped == false)",
> "stopped state (currently indicated by tty->stopped == true)" and "changing state
> (currently impossible due to bool)", but this approach might need to touch many locations,
> and I worry that touching many locations introduces some oversight bugs.
>
>
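
For readers who have not followed the link, the three-state idea could look
roughly like the sketch below. Every name in it (model_flow_state, model_tty
and the helpers) is hypothetical and chosen only for illustration; I have not
copied anything from the linked patch, which may represent the states
differently. The model is single-threaded and ignores locking, whereas in the
kernel the state would still have to be protected, for example by ->flow.lock.

```c
#include <stdbool.h>

/* Hypothetical replacement for the bool tty->stopped. */
enum model_flow_state {
	MODEL_FLOW_STARTED,	/* today: tty->stopped == false */
	MODEL_FLOW_STOPPED,	/* today: tty->stopped == true */
	MODEL_FLOW_CHANGING,	/* today: cannot be expressed with a bool */
};

struct model_tty {
	enum model_flow_state flow;
};

/* Begin a start/stop transition; fails if one is already in progress. */
static bool model_begin_change(struct model_tty *t)
{
	if (t->flow == MODEL_FLOW_CHANGING)
		return false;
	t->flow = MODEL_FLOW_CHANGING;
	return true;
}

/* Complete the transition and publish the final state. */
static void model_finish_change(struct model_tty *t, bool stopped)
{
	t->flow = stopped ? MODEL_FLOW_STOPPED : MODEL_FLOW_STARTED;
}

/* Writers may proceed only when the tty is fully started. */
static bool model_may_write(const struct model_tty *t)
{
	return t->flow == MODEL_FLOW_STARTED;
}
```

In this model the driver's start/stop callbacks could run between
model_begin_change() and model_finish_change() without holding a spinlock,
because the intermediate state keeps other paths out. It also shows why the
change is invasive: every place that reads tty->stopped today would need to
decide what to do in the "changing" state.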