Re: e1000_netpoll() , BUG: sleeping function called from invalid context

From: Gabriel C
Date: Fri Feb 17 2017 - 18:16:18 EST




On 17.02.2017 23:38, Cong Wang wrote:
On Fri, Feb 17, 2017 at 1:20 PM, Gabriel C <nix.or.die@xxxxxxxxx> wrote:
Hi all,

while poking at a different issue I found the following on my logs :

[85362.132770] BUG: sleeping function called from invalid context at
kernel/irq/manage.c:110
[85362.132771] in_atomic(): 1, irqs_disabled(): 1, pid: 1153, name:
systemd-journal
[85362.132772] no locks held by systemd-journal/1153.
[85362.132772] irq event stamp: 60088359
[85362.132777] hardirqs last enabled at (60088359): [<ffffffff810d07c2>]
vprintk_emit+0x432/0x470
[85362.132779] hardirqs last disabled at (60088358): [<ffffffff810d03ec>]
vprintk_emit+0x5c/0x470
[85362.132782] softirqs last enabled at (60088258): [<ffffffff810688fd>]
__do_softirq+0x22d/0x290
[85362.132784] softirqs last disabled at (60088233): [<ffffffff81068c0a>]
irq_exit+0x6a/0xd0
[85362.132784] Preemption disabled at:
[85362.132787] [<ffffffff815203de>] write_msg+0x4e/0xf0
[85362.132790] CPU: 0 PID: 1153 Comm: systemd-journal Tainted: G I
4.10.0-rc8-debug-00001-ga1015e374d94-dirty #5
[85362.132791] Hardware name: FUJITSU PRIMERGY
TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709
02/04/2013
[85362.132792] Call Trace:
[85362.132796] dump_stack+0x86/0xc1
[85362.132799] ___might_sleep+0x213/0x230
[85362.132801] __might_sleep+0x6b/0x80
[85362.132803] synchronize_irq+0x33/0x90
[85362.132805] ? __irq_put_desc_unlock+0x19/0x40
[85362.132807] ? __disable_irq_nosync+0x4e/0x60
[85362.132808] disable_irq+0x17/0x20


Hmm, your kernel base version is 4.10.0-rc8 but the symbols here
look like prior to my commit, because with my commit here should
be disable_hardirq() calling synchronize_hardirq().

Did you revert it or make any local changes?

The kernel is -rc8 with reverted d966564fcdc19e13eb6ba1fbe6b8101070339c3d
+ http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=202461e2f3c15dbfb05825d29ace0d20cdf55fa4
+ an debug patch from Thomas to find these goldfish issues.
( http://ftp.frugalware.org/pub/other/people/crazy/kernel/t/goldfish-debug.patch )

No other changes..




[85362.132810] e1000_netpoll+0x3d/0x110
[85362.132813] netpoll_poll_dev+0xa0/0x170
[85362.132814] netpoll_send_skb_on_dev+0x1ab/0x2b0
[85362.132816] netpoll_send_udp+0x417/0x450
[85362.132817] write_msg+0xdb/0xf0
[85362.132819] console_unlock+0x2e5/0x430
[85362.132821] ? __down_trylock_console_sem+0x41/0x60
[85362.132822] vprintk_emit+0x456/0x470
[85362.132825] printk_emit+0x2e/0x36
[85362.132827] ? simple_strtoull+0x2c/0x50
[85362.132829] devkmsg_write+0x115/0x160
[85362.132831] do_iter_readv_writev+0xd8/0x100
[85362.132833] do_readv_writev+0xdf/0x220
[85362.132835] ? __might_fault+0x37/0x90
[85362.132838] ? entry_SYSCALL_64_fastpath+0x5/0xc2
[85362.132839] vfs_writev+0x3a/0x50
[85362.132841] do_writev+0x4c/0xd0
[85362.132842] SyS_writev+0xb/0x10
[85362.132843] entry_SYSCALL_64_fastpath+0x1f/0xc2
[85362.132845] RIP: 0033:0x7fc6d305c46d
[85362.132846] RSP: 002b:00007ffca94161e0 EFLAGS: 00000293 ORIG_RAX:
0000000000000014
[85362.132847] RAX: ffffffffffffffda RBX: ffffffff813afad3 RCX:
00007fc6d305c46d
[85362.132848] RDX: 0000000000000005 RSI: 00007ffca94162f0 RDI:
0000000000000006
[85362.132849] RBP: ffffc9000d4c3f88 R08: 0000000000000000 R09:
0000000000000008
[85362.132850] R10: 0000000000000069 R11: 0000000000000293 R12:
00007ffca94162f0
[85362.132851] R13: 000055768b19c920 R14: 00007ffca9416330 R15:
00007ffca94163c0
[85362.132853] ? __this_cpu_preempt_check+0x13/0x20


Seems to be introduced by :

commit 311191297125156319be8f86d546ea1c569f7e95
Author: WANG Cong <xiyou.wangcong@xxxxxxxxx>
Date: Sat Dec 10 14:22:42 2016 -0800

e1000: use disable_hardirq() for e1000_netpoll()

...

The used config can be found there:

http://ftp.frugalware.org/pub/other/people/crazy/kernel/t/config-rcx


Regards,

Gabriel C