Re: [BUG] kgdb/ftrace - sleeping in invalid context

From: Jason Wessel
Date: Thu Nov 17 2016 - 15:06:36 EST


On 11/17/2016 01:16 PM, Brian Norris wrote:
Hi,

I've been poking around KGDB, and I noticed that the KDB 'ftdump'
command (to dump ftrace logs) produces warnings like this:

(gdb) monitor ftdump
Dumping ftrace buffer:
BUG: sleeping function called from invalid context at mm/slab.h:359
in_atomic(): 1, irqs_disabled(): 128, pid: 116, name: irq/65-chromeos
CPU: 4 PID: 116 Comm: irq/65-chromeos Not tainted 4.4.21 #575
Call trace:
[<ffffffc0002087dc>] dump_backtrace+0x0/0x160
[<ffffffc00020895c>] show_stack+0x20/0x28
[<ffffffc0004bdc28>] dump_stack+0x90/0xb0
[<ffffffc0002486d8>] ___might_sleep+0x140/0x14c
[<ffffffc000248764>] __might_sleep+0x80/0x90
[<ffffffc00034ace0>] kmem_cache_alloc_trace+0x5c/0x238
[<ffffffc0002d3ab8>] ring_buffer_read_prepare+0x4c/0xa4
[<ffffffc0002f3214>] kdb_ftdump+0x200/0x3e4
[<ffffffc0002c6244>] kdb_parse+0x548/0x628
[<ffffffc0002c17a8>] gdb_serial_stub+0x89c/0xaac
[<ffffffc0002bf954>] kgdb_cpu_enter+0x1e0/0x5b0
[<ffffffc0002c002c>] kgdb_handle_exception+0x1a0/0x1e4
[<ffffffc000213004>] kgdb_compiled_brk_fn+0x30/0x3c
[<ffffffc000201a18>] brk_handler+0x9c/0xb0
[<ffffffc0002004d4>] do_debug_exception+0x60/0xe0
Exception stack(0xffffffc0ecffb820 to 0xffffffc0ecffb950)
b820: ffffffc0010a6000 0000008000000000 ffffffc0ecffba10 ffffffc0002bf020
b840: 00000000600001c5 0000000000000000 ffffffc0ecffb870 0000000000018160
b860: 0000000000000003 00000000000000c3 ffffffc000be15b9 ffffffc000be7053
b880: 0000000000000005 ffffffc0010a6c48 ffffffc0ecffb930 ffffffc00026f8e8
b8a0: ffffffc000c362ca ffffffc00108c000 ffffffc00026f880 0000000000000001
b8c0: 0000000000000007 ffffffc0ecfb6600 0000000000000001 cb88537fdc8ba615
b8e0: ffffffc0011b6430 0000000000000001 0000000000000000 ffffffc0011b6438
b900: 0000000000000000 0000000000000000 0000000000000006 ffffffc0010cc1c0
b920: ffffffc00043eed8 7f7f7f7fffffffff 681f39616369ff46 7f7f7f7f7f7f7f7f
b940: 0101010101010101 0000000000000008
[<ffffffc0002035d4>] el1_dbg+0x18/0x74
[<ffffffc0002bf0c0>] sysrq_handle_dbg+0x54/0x5c
[<ffffffc00054e5dc>] __handle_sysrq+0xa4/0x15c
[<ffffffc00054e7ec>] sysrq_filter+0x11c/0x348
[<ffffffc00069abb8>] input_to_handler+0x60/0x100
[<ffffffc00069c6d0>] input_pass_values.part.2+0x78/0x144
[<ffffffc00069e408>] input_handle_event+0x280/0x4d4
[<ffffffc00069e6cc>] input_event+0x70/0x8c
[...]

It looks like (almost?) all KDB code gets run in an exception context,
so I don't see how sleeping allocations (such as those in
ring_buffer_read_prepare()) are supposed to be able to work if they
don't immediately find enough memory.

I can't think of a great simple fix for this, other than borrowing the
hack from kdb_private.h:

#define GFP_KDB (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL)

AFAICT, the necessary allocations are all pretty small actually, and so
GFP_ATOMIC may not be a problem, even in the non-KDB ring buffer case.
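
To make the idea concrete, here is a minimal user-space sketch of the
same pattern: pick the allocation flag from the calling context and pass
it down to the allocator instead of hard-coding a sleeping flag. This is
not kernel code. Every sketch_* name is hypothetical, malloc() stands in
for the slab allocator, and a plain boolean stands in for in_interrupt().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's GFP_ATOMIC / GFP_KERNEL. */
enum sketch_gfp { SKETCH_GFP_ATOMIC, SKETCH_GFP_KERNEL };

/* Hypothetical stand-in for in_interrupt(); the caller flips it by hand. */
static bool sketch_in_interrupt_flag;

static bool sketch_in_interrupt(void)
{
	return sketch_in_interrupt_flag;
}

/* Same shape as GFP_KDB: atomic flag in interrupt context, else sleeping. */
#define SKETCH_GFP_KDB \
	(sketch_in_interrupt() ? SKETCH_GFP_ATOMIC : SKETCH_GFP_KERNEL)

/*
 * Allocator that takes its flag from the caller. It reports through
 * *may_sleep whether this request would be allowed to sleep, which is
 * the condition the might_sleep() check complains about.
 */
static void *sketch_alloc(size_t size, enum sketch_gfp flags, bool *may_sleep)
{
	assert(may_sleep != NULL);
	*may_sleep = (flags == SKETCH_GFP_KERNEL);
	return malloc(size);	/* may return NULL; callers must check */
}
```

With the flag set, sketch_alloc(16, SKETCH_GFP_KDB, &s) leaves s false;
with it clear, s is true. The trade-off is that an atomic allocation can
fail where a sleeping one would have waited, so the caller has to handle
a NULL return.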

Thoughts? I figured I'd ask questions before blindly sending patches, as
I'm not very familiar with this code.

In the past some buffers had to be pre-allocated outside of the exception context in order to properly dump the ftrace buffer. Perhaps these patches were lost or the interface changed slightly over time. Certainly we should never perform allocations while in the exception context.
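
As a rough user-space illustration of that pre-allocation pattern (not the actual kdb or ring buffer code; every sketch_* name here is made up, and malloc() stands in for a sleeping kernel allocation): the buffer is allocated once from a context that is allowed to sleep, and the dump path only borrows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical scratch buffer reserved for the dump path. */
struct sketch_dump_buf {
	char *data;
	size_t size;
	bool in_use;
};

static struct sketch_dump_buf sketch_buf;

/* Call from a context that may sleep (e.g. init). 0 on success, -1 on failure. */
static int sketch_dump_buf_init(size_t size)
{
	sketch_buf.data = malloc(size);
	if (!sketch_buf.data)
		return -1;
	sketch_buf.size = size;
	sketch_buf.in_use = false;
	return 0;
}

/* Exception-context path: never allocates, only hands out the reserved buffer. */
static char *sketch_dump_buf_get(size_t *size)
{
	assert(size != NULL);
	if (!sketch_buf.data || sketch_buf.in_use)
		return NULL;	/* not initialised, or already borrowed */
	sketch_buf.in_use = true;
	*size = sketch_buf.size;
	return sketch_buf.data;
}

/* Return the buffer once the dump is finished. */
static void sketch_dump_buf_put(void)
{
	sketch_buf.in_use = false;
}

/* Release the reservation, again from a context that may sleep. */
static void sketch_dump_buf_exit(void)
{
	free(sketch_buf.data);
	sketch_buf.data = NULL;
	sketch_buf.size = 0;
	sketch_buf.in_use = false;
}
```

The get path fails cleanly if the buffer was never set up or is already in use, so the dump command can print an error instead of allocating.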

Cheers,
Jason.