On Mon, Oct 30, 2017 at 12:18:35AM +0100, Fengguang Wu wrote:
This is not as important for polling sources as it is for the interrupt sources since polling sources
CC related developers for the BUG in v4.14-rc6.
Looks like Tyler broke it:
On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
Hi Linus,
Here is the dmesg fragment:
So far we see the boot errors/warnings below when testing v4.14-rc6.
They reached the RC release mainly because of various imperfections in
0day's auto bisection. So I am listing them manually here and will CC the
ones that look easy to debug to the corresponding maintainers in follow-up emails.
boot_successes: 4700
boot_failures: 247
BUG:kernel_hang_in_test_stage: 152
BUG:kernel_reboot-without-warning_in_test_stage: 10
BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c: 1
BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c: 3
BUG:sleeping_function_called_from_invalid_context_at_mm/page_alloc.c: 21
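To put the counts above in proportion, here is a minimal sketch that parses "name: count" lines and computes the overall boot failure rate. The helper names (parse_counts, failure_rate) are made up for illustration and are not part of the 0day tooling; only three of the lines above are copied in.

```python
# Three lines copied from the summary above; the rest are left out for brevity.
SUMMARY = """\
boot_successes: 4700
boot_failures: 247
BUG:sleeping_function_called_from_invalid_context_at_mm/page_alloc.c: 21
"""


def parse_counts(text):
    """Map each 'name: count' line to an integer count.

    Splitting on the LAST colon keeps names such as 'BUG:...' intact.
    """
    counts = {}
    for line in text.strip().splitlines():
        name, _, value = line.rpartition(":")
        counts[name.strip()] = int(value)
    return counts


def failure_rate(counts):
    """Fraction of all boots (successes + failures) that failed."""
    total = counts["boot_successes"] + counts["boot_failures"]
    return counts["boot_failures"] / total


counts = parse_counts(SUMMARY)
print(f"{failure_rate(counts):.2%} of boots failed")
```

With the numbers above, 247 failures out of 4947 boots is roughly 5%.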
[ 47.597981] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x26d34d96462, max_idle_ns: 440795289520 ns
[ 48.626601] clocksource: Switched to clocksource tsc
[ 49.273620] ERST: Error Record Serialization Table (ERST) support is initialized.
[ 49.290288] pstore: using zlib compression
[ 49.299588] pstore: Registered erst as persistent store backend
[ 49.311408] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150
[ 49.312031] in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0
[ 49.312031] CPU: 37 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc6 #1
[ 49.312031] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 49.312031] Call Trace:
[ 49.312031] dump_stack+0x63/0x86
[ 49.312031] ___might_sleep+0xf1/0x110
[ 49.312031] __might_sleep+0x4a/0x80
[ 49.312031] __alloc_pages_nodemask+0x14e/0x270
[ 49.312031] alloc_page_interleave+0x17/0x80
[ 49.312031] alloc_pages_current+0xc8/0xe0
[ 49.312031] __get_free_pages+0xe/0x40
[ 49.312031] pte_alloc_one_kernel+0x15/0x20
[ 49.312031] __pte_alloc_kernel+0x1d/0x100
[ 49.312031] ioremap_page_range+0x330/0x3a0
[ 49.312031] ghes_copy_tofrom_phys+0x182/0x2b0
[ 49.312031] ghes_read_estatus+0x76/0x140
[ 49.312031] ghes_proc+0x1c/0x130
[ 49.312031] ghes_probe+0x157/0x430
[ 49.312031] platform_drv_probe+0x3b/0xa0
[ 49.312031] driver_probe_device+0x29c/0x450
[ 49.312031] __driver_attach+0xdf/0xf0
[ 49.312031] ? driver_probe_device+0x450/0x450
[ 49.312031] bus_for_each_dev+0x60/0xa0
[ 49.312031] driver_attach+0x1e/0x20
[ 49.312031] bus_add_driver+0x170/0x260
[ 49.312031] ? set_debug_rodata+0x17/0x17
[ 49.312031] driver_register+0x60/0xe0
[ 49.312031] __platform_driver_register+0x36/0x40
[ 49.312031] ghes_init+0x10f/0x199
[ 49.312031] ? bert_init+0x215/0x215
[ 49.312031] do_one_initcall+0x43/0x170
[ 49.312031] ? set_debug_rodata+0x17/0x17
[ 49.312031] kernel_init_freeable+0x198/0x220
[ 49.312031] ? rest_init+0xd0/0xd0
[ 49.312031] kernel_init+0xe/0x101
[ 49.312031] ret_from_fork+0x25/0x30
[ 49.670116] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[ 49.691436] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 49.729954] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 49.767235] Non-volatile memory driver v1.3
[ 49.778363] Linux agpgart interface v0.103
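To read the call trace above more easily, here is a small sketch that pulls the function names out of dmesg-style trace lines and drops the frames marked with "?", which the stack unwinder reports as unreliable. The regular expression and the helper names are assumptions made for this illustration, and the sample is an abbreviated excerpt of the trace, not the full log.

```python
import re

# "[ 49.312031] name+0xOFF/0xSIZE", with an optional "? " before the name
# for frames the unwinder is not sure about.
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)

# Abbreviated excerpt of the trace above (several frames omitted).
SAMPLE = """\
[ 49.312031] __might_sleep+0x4a/0x80
[ 49.312031] __alloc_pages_nodemask+0x14e/0x270
[ 49.312031] ioremap_page_range+0x330/0x3a0
[ 49.312031] ghes_copy_tofrom_phys+0x182/0x2b0
[ 49.312031] ghes_probe+0x157/0x430
[ 49.312031] ? driver_probe_device+0x450/0x450
[ 49.312031] ghes_init+0x10f/0x199
"""


def parse_frames(lines):
    """Return (function_name, is_reliable) pairs, innermost frame first."""
    frames = []
    for line in lines:
        match = FRAME.match(line.strip())
        if match:
            frames.append((match.group(2), match.group(1) is None))
    return frames


def reliable_chain(lines):
    """Function names of the reliable frames only, innermost first."""
    return [name for name, reliable in parse_frames(lines) if reliable]


chain = reliable_chain(SAMPLE.splitlines())
print(" <- ".join(chain))
```

Read left to right, the chain goes from the might-sleep check in the page allocator back through ioremap_page_range and ghes_copy_tofrom_phys to ghes_probe.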
77b246b32b2c ("acpi: apei: check for pending errors when probing GHES entries")
and it went into 4.13 and -stable.
Tyler, why is it so important to do the polling immediately upon
registration? Can't we wait until the polling does it?