Re: [PATCH] kgdboc: Be a bit more robust about handling earlycon leaving
From: Daniel Thompson
Date: Fri May 01 2020 - 07:49:50 EST
On Thu, Apr 30, 2020 at 09:59:09AM -0700, Douglas Anderson wrote:
> The original implementation of kgdboc_earlycon included a comment
> about how it was impossible to get notified about the boot console
> going away without making changes to the Linux core. Since folks
> often don't want to change the Linux core for kgdb's purposes, the
> kgdboc_earlycon implementation did a bit of polling to figure out when
> the boot console went away.
>
> It turns out, though, that it is possible to get notified about the
> boot console going away. The solution is either clever or a hack
> depending on your viewpoint. ...or, perhaps, a clever hack. All we
> need to do is head-patch the "exit" routine of the boot console. We
> know that "struct console" must be writable because it has a "next"
> pointer in it, so we can just put our own exit routine in, do our
> stuff, and then call back to the original.
I think I'm in the hack camp on this one!
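For anyone following along, the head-patching trick described above can be sketched in userspace roughly as below. This is only an illustration under stated assumptions: the struct is a cut-down stand-in for the kernel's "struct console", and hook_console_exit(), hooked_exit() and sample_boot_exit() are made-up names, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down userspace stand-in for the kernel's struct console (assumption). */
struct console {
	const char *name;
	int (*exit)(struct console *con);
	struct console *next;
};

/* Saved pointer to the exit routine we displaced (hypothetical helper state). */
static int (*orig_exit)(struct console *con);
static int notified;   /* counts how often our hook ran */
static int orig_calls; /* counts how often the original exit ran */

/* Our replacement exit routine: do our stuff, then call back to the original. */
static int hooked_exit(struct console *con)
{
	int ret = 0;

	notified++;            /* "our stuff": note that the boot console is leaving */
	con->exit = orig_exit; /* put the original routine back */
	if (orig_exit)
		ret = orig_exit(con);
	return ret;
}

/* Head-patch the exit routine; works because the struct is writable. */
static void hook_console_exit(struct console *con)
{
	assert(con != NULL);
	orig_exit = con->exit;
	con->exit = hooked_exit;
}

/* Example "original" exit routine for a pretend boot console. */
static int sample_boot_exit(struct console *con)
{
	(void)con;
	orig_calls++;
	return 0;
}
```

Calling hook_console_exit() on a console and then invoking its exit pointer runs the hook first and the original second, and leaves the original pointer restored afterwards.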
> This works great to get notified about the boot console going away.
> The (slight) problem is that in the context of the boot console's exit
> routine we can't call tty_find_polling_driver().
I presume this is something to do with the tty_mutex?
> We solve this by
> kicking off some work on the system_wq when we get notified and this
> works pretty well.
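The deferral idea quoted above (do nothing heavy in the exit routine, just queue work to run later) can be modelled in userspace roughly as follows. This is a toy, single-threaded queue, not the kernel workqueue API; queue_work_item(), run_pending_work(), earlycon_exit_work() and on_boot_console_exit() are hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a kernel work item (assumption, not struct work_struct). */
struct work_item {
	void (*fn)(struct work_item *w);
	struct work_item *next;
	int pending;
};

static struct work_item *queue_head;
static struct work_item **queue_tail = &queue_head;

/* Queue a work item unless it is already pending; returns 1 if queued. */
static int queue_work_item(struct work_item *w)
{
	assert(w != NULL && w->fn != NULL);
	if (w->pending)
		return 0;
	w->pending = 1;
	w->next = NULL;
	*queue_tail = w;
	queue_tail = &w->next;
	return 1;
}

/* Run everything queued so far, in order; returns the number of items run. */
static int run_pending_work(void)
{
	int ran = 0;

	while (queue_head) {
		struct work_item *w = queue_head;

		queue_head = w->next;
		if (!queue_head)
			queue_tail = &queue_head;
		w->pending = 0; /* cleared before running, so fn may requeue itself */
		w->fn(w);
		ran++;
	}
	return ran;
}

static int reconfigured; /* counts how often the deferred work ran */

/* Deferred part: where the tty lookup would happen in the real thing. */
static void earlycon_exit_work(struct work_item *w)
{
	(void)w;
	reconfigured++;
}

static struct work_item exit_work = { earlycon_exit_work, NULL, 0 };

/* Called from the (restricted) exit context: only queues, does no real work. */
static void on_boot_console_exit(void)
{
	queue_work_item(&exit_work);
}
```

The point of the split is that on_boot_console_exit() never does the lookup itself; the lookup only happens once something later calls run_pending_work(). The sketch says nothing about when that happens, which is exactly where the real approach gets into trouble.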
There are some problems with the workqueue approach.
Firstly, it runs too early on many systems (esp. those that register
the console from a console initcall). kgdboc cannot find the tty, I think
because the console is registered a long time before the corresponding
tty comes up.
Secondly, I am seeing warnings related to the tty_mutex where the
might_sleep() machinery ends up flushing the active workqueue.
[ 39.298332] ------------[ cut here ]------------
[ 39.298332] WARNING: CPU: 0 PID: 5 at kernel/workqueue.c:3033 __flush_work+00
[ 39.298332] Modules linked in:
[ 39.298332] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.7.0-rc3+ #47
[ 39.298332] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-204
[ 39.298332] Workqueue: events kgdboc_earlycon_exit_work_fn
[ 39.298332] RIP: 0010:__flush_work+0x19c/0x1c0
[ 39.298332] Code: 4c 8b 6d 20 e9 06 ff ff ff 41 c6 04 24 00 fb 45 31 f6 eb 8f
[ 39.298332] RSP: 0018:ffff993500033dd0 EFLAGS: 00010246
[ 39.298332] RAX: 0000000000000000 RBX: ffffffffadcd0720 RCX: 0000000000000001
[ 39.298332] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffadcd0820
[ 39.298332] RBP: ffff8a633ec299c0 R08: 0000000000000000 R09: 0000000000000001
[ 39.298332] R10: 000000000000000a R11: f000000000000000 R12: 00000000ffffffed
[ 39.298332] R13: ffff8a633e408840 R14: 0000000000000000 R15: ffff8a633e408840
[ 39.298332] FS: 0000000000000000(0000) GS:ffff8a633ec00000(0000) knlGS:00000
[ 39.298332] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 39.298332] CR2: ffff8a6333201000 CR3: 0000000032a0a000 CR4: 00000000000006f0
[ 39.298332] Call Trace:
[ 39.298332] ? _cond_resched+0x10/0x20
[ 39.298332] ? mutex_lock+0x9/0x30
[ 39.298332] ? tty_find_polling_driver+0x134/0x1a0
[ 39.298332] configure_kgdboc+0x12d/0x1c0
[ 39.298332] kgdboc_earlycon_exit_work_fn+0x1a/0x40
[ 39.298332] process_one_work+0x1d3/0x380
[ 39.298332] worker_thread+0x45/0x3c0
[ 39.298332] kthread+0xf6/0x130
[ 39.298332] ? process_one_work+0x380/0x380
[ 39.298332] ? kthread_park+0x80/0x80
[ 39.298332] ret_from_fork+0x22/0x40
[ 39.298332] ---[ end trace 1190f578d6e11204 ]---
[ 39.298338] KGDB: Unregistered I/O driver kgdboc_earlycon, debugger disabled
Daniel.