Re: [PATCH 4.16 000/196] 4.16.4-stable review

From: Shuah Khan
Date: Mon Apr 23 2018 - 16:07:49 EST


On 04/23/2018 12:03 PM, Greg Kroah-Hartman wrote:
> On Sun, Apr 22, 2018 at 03:50:20PM +0200, Greg Kroah-Hartman wrote:
>> This is the start of the stable review cycle for the 4.16.4 release.
>> There are 196 patches in this series, all will be posted as a response
>> to this one. If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Tue Apr 24 13:50:16 UTC 2018.
>> Anything received after that time might be too late.
>>
>> The whole patch series can be found in one patch at:
>> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.4-rc1.gz
>> or in the git tree and branch at:
>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.16.y
>> and the diffstat can be found below.
>
> There is a -rc3 out now, to fix some issues reported with -rc1 and -rc2
> (I forgot to announce -rc2, sorry.):
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.16.4-rc3.gz
>
>
>

I just tried rc1 - will try rc3. Has the lock problem seen already: ( I will
try rc3 and if I setill see the problem, will start bisect)

================================
WARNING: inconsistent lock state
4.16.4-rc1+ #4 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/3/0 [HC1[1]:SC1[1]:HE0:SE0] takes:
(fs_reclaim){?.+.}, at: [< (ptrval)>] fs_reclaim_acquire.part.76+0x5/0x30
{HARDIRQ-ON-W} state was registered at:
fs_reclaim_acquire.part.76+0x29/0x30
kmem_cache_alloc_node_trace+0x39/0x2a0
alloc_worker+0x2d/0xa0
create_worker+0xa0/0x2b0
workqueue_init+0x315/0x39a
kernel_init_freeable+0x153/0x312
kernel_init+0xf/0x120
ret_from_fork+0x3a/0x50
irq event stamp: 164931
hardirqs last enabled at (164930): [< (ptrval)>] __do_softirq+0xe1/0x560
hardirqs last disabled at (164931): [< (ptrval)>] interrupt_entry+0xbd/0xf0
softirqs last enabled at (164928): [< (ptrval)>] irq_enter+0x75/0x80
softirqs last disabled at (164929): [< (ptrval)>] irq_exit+0x104/0x110

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

no locks held by swapper/3/0.

stack backtrace:
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.16.4-rc1+ #4
Hardware name: System76, Inc. Wild Dog Performance/H87-PLUS, BIOS 0705 12/05/2013
Call Trace:
<IRQ>
dump_stack+0x67/0x98
print_usage_bug+0x24c/0x266
mark_lock+0x6c1/0x7a0
? check_usage_backwards+0x230/0x230
__lock_acquire+0x1237/0x1e20
? debug_check_no_locks_freed+0x190/0x190
? debug_check_no_locks_freed+0x190/0x190
? memzero_explicit+0xa/0x10
? extract_buf+0x1b1/0x220
? trace_event_raw_event_xfer_secondary_pool+0x1b0/0x1b0
? _raw_spin_unlock_irqrestore+0x3e/0x50
? match_held_lock+0x1b/0x210
? lock_acquire+0xcd/0x220
lock_acquire+0xcd/0x220
? fs_reclaim_acquire.part.76+0x5/0x30
? lock_acquire+0xcd/0x220
fs_reclaim_acquire.part.76+0x29/0x30
? fs_reclaim_acquire.part.76+0x5/0x30
__kmalloc+0x51/0x2e0
? crng_reseed+0x229/0x440
crng_reseed+0x229/0x440
? init_std_data+0x1c0/0x1c0
? add_interrupt_randomness+0x27d/0x340
credit_entropy_bits+0x3b7/0x3d0
add_interrupt_randomness+0x27d/0x340
? extract_entropy.constprop.41+0x1b0/0x1b0
? rcu_read_lock_sched_held+0x7c/0x80
? __handle_irq_event_percpu+0xf9/0x3c0
handle_irq_event_percpu+0x8b/0xe0
? __handle_irq_event_percpu+0x3c0/0x3c0
? lock_acquire+0xcd/0x220
? handle_edge_irq+0x28/0x2c0
? do_raw_spin_unlock+0x91/0x120
handle_irq_event+0x5a/0x90
handle_edge_irq+0xef/0x2c0
handle_irq+0x32/0x40
do_IRQ+0x60/0x130
common_interrupt+0xf/0xf
RIP: 0010:__do_softirq+0xe7/0x560
RSP: 0018:ffff8803cfec7f58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde
RAX: 00000000ffffffff RBX: ffff8803cd6b26c0 RCX: ffffffff8112890c
RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8803cd6b2f24
RBP: ffff8803cd6b2ef4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000202
? common_interrupt+0xa/0xf
? trace_hardirqs_on_caller+0x18c/0x280
? __do_softirq+0xe1/0x560
irq_exit+0x104/0x110
smp_apic_timer_interrupt+0xb8/0x300
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:cpuidle_enter_state+0xc4/0x430
RSP: 0018:ffff8803cd6c7d90 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff12
RAX: 0000000000000000 RBX: ffffe8ffffcca660 RCX: ffffffff8112890c
RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8803cd6b2f24
RBP: 0000000000011c2f R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
R13: ffffffff82d17658 R14: ffffffff82d17640 R15: ffffffff82d17680
? trace_hardirqs_on_caller+0x18c/0x280
do_idle+0x1a9/0x1f0
cpu_startup_entry+0xc2/0xd0
? cpu_in_idle+0x20/0x20
? _raw_spin_unlock_irqrestore+0x32/0x50
? trace_hardirqs_on_caller+0x18c/0x280
start_secondary+0x282/0x2f0
? set_cpu_sibling_map+0x840/0x840
secondary_startup_64+0xa5/0xb0
random: crng init done
device: '0:44': device_add
PM: Adding info for No Bus:0:44
device: 'lo': device_add
PM: Adding info for No Bus:lo
r8169 0000:03:00.0 eth0: link up
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
device: 'lxcbr0': device_add
PM: Adding info for No Bus:lxcbr0
IPv6: ADDRCONF(NETDEV_UP): lxcbr0: link is not ready
PM: Removing info for No Bus:lxcbr0

thanks,
-- Shuah