[BUG] kgdb/ftrace - sleeping in invalid context

From: Brian Norris
Date: Thu Nov 17 2016 - 14:16:14 EST


Hi,

I've been poking around KGDB, and I noticed that the KDB 'ftdump'
command (to dump ftrace logs) produces warnings like this:

(gdb) monitor ftdump
Dumping ftrace buffer:
BUG: sleeping function called from invalid context at mm/slab.h:359
in_atomic(): 1, irqs_disabled(): 128, pid: 116, name: irq/65-chromeos
CPU: 4 PID: 116 Comm: irq/65-chromeos Not tainted 4.4.21 #575
Call trace:
[<ffffffc0002087dc>] dump_backtrace+0x0/0x160
[<ffffffc00020895c>] show_stack+0x20/0x28
[<ffffffc0004bdc28>] dump_stack+0x90/0xb0
[<ffffffc0002486d8>] ___might_sleep+0x140/0x14c
[<ffffffc000248764>] __might_sleep+0x80/0x90
[<ffffffc00034ace0>] kmem_cache_alloc_trace+0x5c/0x238
[<ffffffc0002d3ab8>] ring_buffer_read_prepare+0x4c/0xa4
[<ffffffc0002f3214>] kdb_ftdump+0x200/0x3e4
[<ffffffc0002c6244>] kdb_parse+0x548/0x628
[<ffffffc0002c17a8>] gdb_serial_stub+0x89c/0xaac
[<ffffffc0002bf954>] kgdb_cpu_enter+0x1e0/0x5b0
[<ffffffc0002c002c>] kgdb_handle_exception+0x1a0/0x1e4
[<ffffffc000213004>] kgdb_compiled_brk_fn+0x30/0x3c
[<ffffffc000201a18>] brk_handler+0x9c/0xb0
[<ffffffc0002004d4>] do_debug_exception+0x60/0xe0
Exception stack(0xffffffc0ecffb820 to 0xffffffc0ecffb950)
b820: ffffffc0010a6000 0000008000000000 ffffffc0ecffba10 ffffffc0002bf020
b840: 00000000600001c5 0000000000000000 ffffffc0ecffb870 0000000000018160
b860: 0000000000000003 00000000000000c3 ffffffc000be15b9 ffffffc000be7053
b880: 0000000000000005 ffffffc0010a6c48 ffffffc0ecffb930 ffffffc00026f8e8
b8a0: ffffffc000c362ca ffffffc00108c000 ffffffc00026f880 0000000000000001
b8c0: 0000000000000007 ffffffc0ecfb6600 0000000000000001 cb88537fdc8ba615
b8e0: ffffffc0011b6430 0000000000000001 0000000000000000 ffffffc0011b6438
b900: 0000000000000000 0000000000000000 0000000000000006 ffffffc0010cc1c0
b920: ffffffc00043eed8 7f7f7f7fffffffff 681f39616369ff46 7f7f7f7f7f7f7f7f
b940: 0101010101010101 0000000000000008
[<ffffffc0002035d4>] el1_dbg+0x18/0x74
[<ffffffc0002bf0c0>] sysrq_handle_dbg+0x54/0x5c
[<ffffffc00054e5dc>] __handle_sysrq+0xa4/0x15c
[<ffffffc00054e7ec>] sysrq_filter+0x11c/0x348
[<ffffffc00069abb8>] input_to_handler+0x60/0x100
[<ffffffc00069c6d0>] input_pass_values.part.2+0x78/0x144
[<ffffffc00069e408>] input_handle_event+0x280/0x4d4
[<ffffffc00069e6cc>] input_event+0x70/0x8c
[...]

It looks like (almost?) all KDB code runs in an exception context, where
sleeping is not allowed, so I don't see how sleeping allocations (such as
those in ring_buffer_read_prepare()) are supposed to work if they can't
immediately find enough memory.

I can't think of a great simple fix for this, other than borrowing the
hack from kdb_private.h:

#define GFP_KDB (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL)

AFAICT, the necessary allocations are all pretty small, so GFP_ATOMIC
may not be a problem, even in the non-KDB ring buffer case.
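
To make the idea concrete, here is a rough userspace sketch of the flag
selection (not kernel code, and not a proposed patch). The flag values,
the fake_in_interrupt toggle, and the reader_alloc_flags() helper are all
made up for illustration; only the shape of the GFP_KDB macro comes from
kdb_private.h.

```c
#include <stdbool.h>

/* Stand-in type and values for illustration; the real ones are in <linux/gfp.h>. */
typedef unsigned int gfp_t;
#define GFP_ATOMIC 0x1u   /* must not sleep */
#define GFP_KERNEL 0x2u   /* may sleep */

/* Hypothetical stub for the kernel's in_interrupt(); a test can toggle it. */
static bool fake_in_interrupt;

static bool in_interrupt(void)
{
	return fake_in_interrupt;
}

/* Same shape as the macro in kdb_private.h. */
#define GFP_KDB (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL)

/*
 * Hypothetical helper: choose the flags for an allocation made on behalf
 * of a KDB command such as 'ftdump'.
 */
static gfp_t reader_alloc_flags(void)
{
	return GFP_KDB;
}
```

With something like this, an allocation reached from the debugger would
use the non-sleeping flags, while other callers would keep GFP_KERNEL.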

Thoughts? I figured I'd ask questions before blindly sending patches, as
I'm not very familiar with this code.

Regards,
Brian