Re: [PATCH v2 2/7] serial: qcom-geni: fix shutdown race

From: Doug Anderson
Date: Fri Oct 11 2024 - 10:30:58 EST


Hi,

On Thu, Oct 10, 2024 at 11:51 PM Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> > > Not sure how your "console process" works, but this should only happen
> > > if you do not enable the serial console (console=ttyMSM0) and then try
> > > to use a polled console (as enabling the console will prevent port
> > > shutdown from being called).
> >
> > That simply doesn't seem to be the case for me. The port shutdown
> > seems to be called. To confirm, I put a printout at the start of
> > qcom_geni_serial_shutdown(). I see in my /proc/cmdline:
> >
> > console=ttyMSM0,115200n8
> >
> > ...and I indeed verify that I see console messages on my UART. I then run:
> >
> > stop console-ttyMSM0
> >
> > ...and I see on the UART:
> >
> > [ 92.916964] DOUG: qcom_geni_serial_shutdown
> > [ 92.922703] init: console-ttyMSM0 main process (611) killed by TERM signal
> >
> > Console messages keep coming out the UART even though the agetty isn't
> > there.
>
> And this is with a Chromium kernel, not mainline?

Who do you take me for?!?! :-P :-P :-P Of course it's with mainline.


> If you take a look at tty_port_shutdown() there's a hack in there for
> consoles that was added back in 2010 and that prevents shutdown() from
> called for console ports.
>
> Put perhaps you manage to hit shutdown() via some other path. Serial
> core is not yet using tty_port_hangup() so a hangup might trigger
> that...
>
> Could you check that with a dump_stack()?

Sure. Typed from the agetty itself, here ya go. Git hash is not a
mainline git hash because I have your patches applied. "dirty" is
because of the printout / dump_stack().

lazor-rev9 ~ # stop console-ttyMSM0
[ 68.772828] DOUG: qcom_geni_serial_shutdown
[ 68.777365] CPU: 2 UID: 0 PID: 589 Comm: login Not tainted
6.12.0-rc1-g0bde0d120d58-dirty #1
ac8ed1a05abcc73f4fafe0543cbc76768ea594e1
[ 68.789737] Hardware name: Google Lazor (rev9) with LTE (DT)
[ 68.795581] Call trace:
[ 68.798124] dump_backtrace+0xf8/0x120
[ 68.802025] show_stack+0x24/0x38
[ 68.805463] dump_stack_lvl+0x40/0xc8
[ 68.809265] dump_stack+0x18/0x38
[ 68.812702] qcom_geni_serial_shutdown+0x38/0x110
[ 68.817578] uart_port_shutdown+0x48/0x68
[ 68.821736] uart_shutdown+0xcc/0x170
[ 68.825530] uart_hangup+0x54/0x158
[ 68.829154] __tty_hangup+0x20c/0x318
[ 68.832954] tty_vhangup_session+0x20/0x38
[ 68.837195] disassociate_ctty+0xe8/0x1a8
[ 68.841355] do_exit+0x10c/0x358
[ 68.844716] do_group_exit+0x9c/0xa8
[ 68.848441] get_signal+0x408/0x4d8
[ 68.852071] do_signal+0xa8/0x770
[ 68.855526] do_notify_resume+0x78/0x118
[ 68.859605] el0_svc+0x64/0x68
[ 68.862790] el0t_64_sync_handler+0x20/0x128
[ 68.867218] el0t_64_sync+0x1a8/0x1b0
[ 68.872933] init: console-ttyMSM0 main process (589) killed by TERM signal


> > Now I (via ssh) drop into the debugger:
> >
> > echo g > /proc/sysrq-trigger
> >
> > I see the "kgdb" prompt but I can't interact with it because
> > qcom_geni_serial_shutdown() stopped RX.
>
> How about simply amending poll_get_char() so that it enables the
> receiver if it's not already enabled?

Yeah, this would probably work.

-Doug