Re: btrfs soft lockups: locks gone crazy
From: Davidlohr Bueso
Date: Fri Oct 17 2014 - 14:46:56 EST
ping?
On Tue, 2014-10-14 at 00:27 -0700, Davidlohr Bueso wrote:
> Hello,
>
> I'm getting massive amounts of cpu soft lockups in Linus's tree for
> today. This occurs almost immediately and is very reproducible in aim7
> disk workloads using btrfs:
>
> kernel:[ 559.800017] NMI watchdog: BUG: soft lockup - CPU#114 stuck for 22s! [reaim:44435]
> ...
> [ 999.800070] Modules linked in: iptable_filter(E) ip_tables(E) x_tables(E) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) iscsi_ibft(E) iscsi_boot_sysfs(E) af_packet(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) ext4(E) crc16(E) mbcache(E) coretemp(E) jbd2(E) kvm_intel(E) kvm(E) crct10dif_pclmul(E) crc32_pclmul(E) iTCO_wdt(E) sb_edac(E) iTCO_vendor_support(E) joydev(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) pcspkr(E) edac_core(E) i2c_i801(E) lpc_ich(E) mfd_core(E) ipmi_si(E) ipmi_msghandler(E) wmi(E) processor(E) shpchp(E) acpi_pad(E) button(E) dm_mod(E) btrfs(E) xor(E) raid6_pq(E) sd_mod(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) mgag200(E) syscopyarea(E) sysfillrect(E) ehci_pci(E) sysimgblt(E) i2c_algo_bit(E) drm_kms_helper(E) ixgbe(E) ahci(E) ehci_hcd(E) ttm(E) mdio(E) libahci(E) ptp(E) crc32c_intel(E) mpt2sas(E) pps_core(E) usbcore(E) drm(E) libata(E) raid_class(E) usb_common(E) dca(E) scsi_transport_sas(E) sg(E) scsi_mod(E) autofs4(E)
> [ 999.800070] CPU: 53 PID: 1027 Comm: kworker/u292:3 Tainted: G EL 3.17.0-3-default+ #2
> [ 999.800070] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0044.R01.1310221705 10/22/2013
> [ 999.800070] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
> [ 999.800070] task: ffff882054e76190 ti: ffff882054e78000 task.ti: ffff882054e78000
> [ 999.800070] RIP: 0010:[<ffffffff810a2e46>] [<ffffffff810a2e46>] queue_read_lock_slowpath+0x86/0x90
> [ 999.800070] RSP: 0018:ffff882054e7ba00 EFLAGS: 00000216
> [ 999.800070] RAX: 0000000000008a64 RBX: ffff882054e7b9b0 RCX: 0000000000008a70
> [ 999.800070] RDX: 0000000000008a70 RSI: 00000000000000c0 RDI: ffff881833b49ea0
> [ 999.800070] RBP: ffff882054e7ba00 R08: ffff881833b49e64 R09: 0000000000000001
> [ 999.800070] R10: 0000000000000000 R11: 00000000ffffffff R12: ffff881047377f00
> [ 999.800070] R13: 0000000000000007 R14: ffff88203bf6d520 R15: ffff881000000000
> [ 999.800070] FS: 0000000000000000(0000) GS:ffff88207f500000(0000) knlGS:0000000000000000
> [ 999.800070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 999.800070] CR2: 00000000006a7f80 CR3: 0000000001a14000 CR4: 00000000001407e0
> [ 999.800070] Stack:
> [ 999.800070] ffff882054e7ba10 ffffffff81573cfc ffff882054e7ba70 ffffffffa033349b
> [ 999.800070] ffff882038e9ac80 ffff88203b58fd00 ffff882038cd4ed8 0000000000000000
> [ 999.800070] ffff882054e7ba68 ffff881833b49e30 ffff882057fb4800 0000000000000001
> [ 999.800070] Call Trace:
> [ 999.800070] [<ffffffff81573cfc>] _raw_read_lock+0x1c/0x30
> [ 999.800070] [<ffffffffa033349b>] btrfs_tree_read_lock+0x5b/0x120 [btrfs]
> [ 999.800070] [<ffffffffa02d1f6b>] btrfs_read_lock_root_node+0x3b/0x50 [btrfs]
> [ 999.800070] [<ffffffffa02d726a>] btrfs_search_slot+0x53a/0xab0 [btrfs]
> [ 999.800070] [<ffffffff812c84c7>] ? cpumask_next_and+0x37/0x50
> [ 999.800070] [<ffffffffa02eebf7>] btrfs_lookup_file_extent+0x37/0x40 [btrfs]
> [ 999.800070] [<ffffffffa030e1ca>] __btrfs_drop_extents+0x16a/0xdb0 [btrfs]
> [ 999.800070] [<ffffffffa03170cc>] ? __set_extent_bit+0x22c/0x550 [btrfs]
> [ 999.800070] [<ffffffffa02d1cea>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
> [ 999.800070] [<ffffffffa02fe26e>] insert_reserved_file_extent.constprop.58+0x9e/0x2f0 [btrfs]
> [ 999.800070] [<ffffffffa0303ff5>] btrfs_finish_ordered_io+0x2e5/0x600 [btrfs]
> ...
>
> [ 1011.800059] CPU: 2 PID: 1 Comm: systemd Tainted: G EL 3.17.0-3-default+ #2
> [ 1011.800059] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0044.R01.1310221705 10/22/2013
> [ 1011.800059] task: ffff880857eb8010 ti: ffff880857ebc000 task.ti: ffff880857ebc000
> [ 1011.800059] RIP: 0010:[<ffffffff810a2e46>] [<ffffffff810a2e46>] queue_read_lock_slowpath+0x86/0x90
> [ 1011.800059] RSP: 0018:ffff880857ebfce0 EFLAGS: 00000202
> [ 1011.800059] RAX: 0000000000008a64 RBX: ffff8808549cd910 RCX: 0000000000008a74
> [ 1011.800059] RDX: 0000000000008a74 RSI: 00000000000000c4 RDI: ffff881833b49ea0
> [ 1011.800059] RBP: ffff880857ebfce0 R08: ffff881833b49e64 R09: 0000000000000000
> [ 1011.800059] R10: ffffe8e7ffc4ed60 R11: 0000000000000000 R12: ffff8808549cd900
> [ 1011.800059] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880855e931e0
> [ 1011.800059] FS: 00007f12cb6ec880(0000) GS:ffff88087f840000(0000) knlGS:0000000000000000
> [ 1011.800059] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1011.800059] CR2: 00007f12cb707000 CR3: 0000002054f16000 CR4: 00000000001407e0
> [ 1011.800059] Stack:
> [ 1011.800059] ffff880857ebfcf0 ffffffff81573cfc ffff880857ebfd50 ffffffffa033349b
> [ 1011.800059] ffff880857ebfda8 ffffffff811cf72e ffff880857ebfd20 ffffffff811cbef8
> [ 1011.800059] ffff880857ebfda8 ffff881833b49e30 ffff882057fb4800 0000000000000000
> [ 1011.800059] Call Trace:
> [ 1011.800059] [<ffffffff81573cfc>] _raw_read_lock+0x1c/0x30
> [ 1011.800059] [<ffffffffa033349b>] btrfs_tree_read_lock+0x5b/0x120 [btrfs]
> [ 1011.800059] [<ffffffff811cf72e>] ? do_last+0x18e/0x11d0
> [ 1011.800059] [<ffffffff811cbef8>] ? inode_permission+0x18/0x50
> [ 1011.800059] [<ffffffffa02d1f6b>] btrfs_read_lock_root_node+0x3b/0x50 [btrfs]
> [ 1011.800059] [<ffffffffa02d726a>] btrfs_search_slot+0x53a/0xab0 [btrfs]
> [ 1011.800059] [<ffffffff811a644e>] ? kmem_cache_alloc+0x47e/0x490
> [ 1011.800059] [<ffffffffa02ff04a>] btrfs_real_readdir+0xda/0x570 [btrfs]
> [ 1011.800059] [<ffffffff811d3e53>] iterate_dir+0xa3/0x130
> [ 1011.800059] [<ffffffff811d42e9>] SyS_getdents+0x89/0x100
> [ 1011.800059] [<ffffffff811d3fb0>] ? fillonedir+0xd0/0xd0
> [ 1011.800059] [<ffffffff81574329>] system_call_fastpath+0x16/0x1b
> [ 1011.800059] Code: 07 3c ff 74 f8 66 83 47 04 01 5d c3 8b 07 3c ff 75 f8 f3 90 8b 07 3c ff 74 f8 5d c3 0f b7 47 04 66 39 c2 74 b5 f3 90 0f b7 47 04 <66> 39 c1 75 f5 eb a8 0f 1f 00 0f 1f 44 00 00 48 63 ff 55 48 8b
> 3c ff 74 f8 5d c3 0f b7 47 04 66 39 c2 74 b5 f3 90 0f b7 47 04 <66> 39 c1 75 f5 eb a8 0f 1f 00 0f 1f 44 00 00 48 63 ff 55 48 8b
>
> The closes report I've been able to found was from Fengguang:
> https://lkml.org/lkml/2014/10/3/183
> However it seems that the 'fix' (commit ea4ebde02e08) does not work in
> this case. Are there any particular semantics btrfs relies on for rw
> spinlocks that could of been broken with qrwlocks?
>
> This issue is seen as far as 3.15 (3.14 is ok), however while bisecting
> I ran into other, at first glance similar issues, also related to
> locking in btrfs. This one actually prevents the whole system from
> booting.
>
> [ 52.294136] ------------[ cut here ]------------
> [ 52.298112] kernel BUG at fs/btrfs/locking.c:269!
> [ 52.298112] invalid opcode: 0000 [#1] SMP
> [ 52.298112] Modules linked in: acpi_cpufreq(E-) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) pcspkr(E) joydev(E) sb_edac(E) ablk_helper(E) edac_core(E) cryptd(E) i2c_i801(E) lpc_ich(E) mfd_core(E) ipmi_si(E) ipmi_msghandler(E) wmi(E) shpchp(E) button(E) acpi_pad(E) processor(E) dm_mod(E) btrfs(E) xor(E) raid6_pq(E) sd_mod(E) hid_generic(E) usbhid(E) sr_mod(E) cdrom(E) mgag200(E) syscopyarea(E) ixgbe(E) sysfillrect(E) sysimgblt(E) i2c_algo_bit(E) ehci_pci(E) drm_kms_helper(E) mdio(E) ahci(E) ehci_hcd(E) libahci(E) ptp(E) ttm(E) mpt2sas(E) crc32c_intel(E) raid_class(E) pps_core(E) libata(E) drm(E) usbcore(E) usb_common(E) dca(E) scsi_transport_sas(E) sg(E) scsi_mod(E) autofs4(E)
> [ 52.298112] CPU: 70 PID: 1304 Comm: systemd-readahe Tainted: G E 3.15.0-3-default #8
> [ 52.298112] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BIVTSDP1.86B.0044.R01.1310221705 10/22/2013
> [ 52.298112] task: ffff881052f78010 ti: ffff881055964000 task.ti: ffff881055964000
> [ 52.298112] RIP: 0010:[<ffffffffa0306fed>] [<ffffffffa0306fed>] btrfs_assert_tree_read_locked.part.0+0x9/0xb [btrfs]
> [ 52.298112] RSP: 0018:ffff881055967948 EFLAGS: 00010246
> [ 52.298112] RAX: 0000000000000000 RBX: ffff880852806e10 RCX: 0000000000000004
> [ 52.298112] RDX: 0000000000000000 RSI: ffff88085285b270 RDI: ffff880852806e10
> [ 52.298112] RBP: ffff881055967948 R08: 0000000000000000 R09: 000000002d748000
> [ 52.298112] R10: 00000000000003ff R11: 0000000000000001 R12: 0000000000001e1b
> [ 52.298112] R13: ffff88087f272c00 R14: 0000000000000001 R15: ffff88085605a000
> [ 52.298112] FS: 00007f33834d2740(0000) GS:ffff88087fb20000(0000) knlGS:0000000000000000
> [ 52.298112] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 52.298112] CR2: 00007f50589787b0 CR3: 00000008553c2000 CR4: 00000000001407e0
> [ 52.298112] Stack:
> [ 52.298112] ffff881055967960 ffffffffa02cd8a9 ffff880852806e10 ffff881055967998
> [ 52.298112] ffffffffa028daf6 0000000000000000 0000000000000000 0000000000000000
> [ 52.298112] 0000000000000000 ffff880852806e10 ffff8810559679e8 ffffffffa028eaac
> [ 52.298112] Call Trace:
> [ 52.298112] [<ffffffffa02cd8a9>] btrfs_tree_read_unlock_blocking+0xc9/0xd0 [btrfs]
> [ 52.298112] [<ffffffffa028daf6>] verify_parent_transid+0x146/0x1f0 [btrfs]
> [ 52.298112] [<ffffffffa028eaac>] btree_read_extent_buffer_pages.constprop.52+0xcc/0x120 [btrfs]
> [ 52.298112] [<ffffffffa02908f8>] read_tree_block+0x38/0x60 [btrfs]
> [ 52.298112] [<ffffffffa02912c6>] btrfs_read_tree_root+0x126/0x190 [btrfs]
> [ 52.298112] [<ffffffffa0291342>] btrfs_read_fs_root+0x12/0x50 [btrfs]
> [ 52.298112] [<ffffffff812b23ad>] ? radix_tree_lookup+0xd/0x10
> [ 52.298112] [<ffffffffa029140d>] btrfs_get_fs_root+0x8d/0x240 [btrfs]
> [ 52.298112] [<ffffffffa02f1b67>] __resolve_indirect_refs+0xf7/0x6f0 [btrfs]
> [ 52.298112] [<ffffffffa02cd5cc>] ? btrfs_tree_read_lock+0x4c/0xf0 [btrfs]
> [ 52.298112] [<ffffffffa02f21d6>] ? __add_missing_keys.isra.14+0x76/0x150 [btrfs]
> [ 52.298112] [<ffffffffa02f2d40>] find_parent_nodes+0x3c0/0x740 [btrfs]
> [ 52.298112] [<ffffffffa02f315a>] __btrfs_find_all_roots+0x9a/0x100 [btrfs]
> [ 52.298112] [<ffffffffa02b0d80>] ? set_state_bits+0x80/0x80 [btrfs]
> [ 52.298112] [<ffffffffa02f3b4d>] iterate_extent_inodes+0x12d/0x310 [btrfs]
> [ 52.298112] [<ffffffffa02b0d80>] ? set_state_bits+0x80/0x80 [btrfs]
> [ 52.298112] [<ffffffffa02f3db7>] iterate_inodes_from_logical+0x87/0xa0 [btrfs]
> [ 52.298112] [<ffffffffa02b89d2>] extent_fiemap+0x262/0x580 [btrfs]
> [ 52.298112] [<ffffffffa029be30>] ? btrfs_get_extent+0xaf0/0xaf0 [btrfs]
> [ 52.298112] [<ffffffffa0299935>] btrfs_fiemap+0x45/0x50 [btrfs]
> [ 52.298112] [<ffffffff811bd9b3>] do_vfs_ioctl+0x153/0x4b0
> [ 52.298112] [<ffffffff811c1444>] ? dput+0xb4/0x170
> [ 52.298112] [<ffffffff811bdd91>] SyS_ioctl+0x81/0xa0
> [ 52.298112] [<ffffffff815510e9>] system_call_fastpath+0x16/0x1b
> [ 52.298112] Code: bd d9 ff ff ff 48 89 df e8 91 7f f6 ff 44 89 e8 eb 05 b8 f4 ff ff ff 48 83 c4 30 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 55 48 89 e5 0f 0b 55 be 7f 05 00 00 48 c7
> [ 52.298112] RIP [<ffffffffa0306fed>] btrfs_assert_tree_read_locked.part.0+0x9/0xb [btrfs]
> [ 52.298112] RSP <ffff881055967948>
> [ 53.230969] ---[ end trace 42ee398737c77c00 ]---
>
> The apparent culprit is:
>
> commit a26e8c9f75b0bfd8cccc9e8f110737b136eb5994
> Author: Josef Bacik <jbacik@xxxxxx>
> Date: Fri Mar 28 17:07:27 2014 -0400
>
> Btrfs: don't clear uptodate if the eb is under IO
>
> But I cannot say if this is in fact related to the issue I see in newer kernels.
>
> Ring any bells?
>
> Thanks,
> Davidlohr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/