Re: [btrfs] 3b54a0a703: WARNING:at_fs/btrfs/inode.c:#btrfs_finish_ordered_io[btrfs]

From: Qu Wenruo
Date: Wed Sep 16 2020 - 00:20:41 EST




On 2020/9/16 上午11:32, Oliver Sang wrote:
> On Tue, Sep 15, 2020 at 04:00:40PM +0800, Qu Wenruo wrote:
>>
>>
>> On 2020/9/15 下午3:40, Qu Wenruo wrote:
>>>
>>>
>>> On 2020/9/15 下午1:54, Oliver Sang wrote:
>>>> On Wed, Sep 09, 2020 at 03:49:30PM +0800, Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2020/9/9 下午3:08, kernel test robot wrote:
>>>>>> Greeting,
>>>>>>
>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>
>>>>>> commit: 3b54a0a703f17d2b1317d24beefcdcca587a7667 ("[PATCH v3 3/5] btrfs: Detect unbalanced tree with empty leaf before crashing btree operations")
>>>>>> url: https://github.com/0day-ci/linux/commits/Qu-Wenruo/btrfs-Enhanced-runtime-defence-against-fuzzed-images/20200809-201720
>>>>>> base: https://git.kernel.org/cgit/linux/kernel/git/kdave/linux.git for-next
>>>>>>
>>>>>> in testcase: fio-basic
>>>>>> with following parameters:
>>>>>>
>>>>>> runtime: 300s
>>>>>> disk: 1SSD
>>>>>> fs: btrfs
>>>>>> nr_task: 100%
>>>>>> test_size: 128G
>>>>>> rw: write
>>>>>> bs: 4k
>>>>>> ioengine: sync
>>>>>> cpufreq_governor: performance
>>>>>> ucode: 0x400002c
>>>>>> fs2: nfsv4
>>>>>>
>>>>>> test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
>>>>>> test-url: https://github.com/axboe/fio
>>>>>>
>>>>>>
>>>>>> on test machine: 96 threads Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz with 128G memory
>>>>>>
>>>>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>>>>
>>>>>>
>>>>>> +----------------------------------------------------------------------------+------------+------------+
>>>>>> | | 2703206ff5 | 3b54a0a703 |
>>>>>> +----------------------------------------------------------------------------+------------+------------+
>>>>>> | boot_successes | 9 | 0 |
>>>>>> | boot_failures | 4 | |
>>>>>> | Kernel_panic-not_syncing:VFS:Unable_to_mount_root_fs_on_unknown-block(#,#) | 4 | |
>>>>>> +----------------------------------------------------------------------------+------------+------------+
>>>>>>
>>>>>>
>>>>>> If you fix the issue, kindly add following tag
>>>>>> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
>>>>>>
>>>>>>
>>>>>
>>>>> According to the full dmesg, it's invalid nritems causing transaction abort.
>>>>>
>>>>> I'm not sure if it's caused by corrupts fs or something else.
>>>>>
>>>>> If intel guys can reproduce it reliably, would you please add such debug
>>>>> diff to output extra info?
>>>>
>>>> Hi Qu, sorry for late. we double confirmed the issue can be reproduced reliably.
>>>> The error will only occur on fbc but not parent commit.
>>>>
>>>> below from applying your path for extra info
>>>> [ 42.539443] [task_0]$
>>>> [ 42.539445]~$
>>>> [ 42.546125] rw=write$
>>>> [ 42.546126]~$
>>>> [ 42.551637] directory=/fs/nvme1n1p1$
>>>> [ 42.551638]~$
>>>> [ 42.559135] numjobs=96' | fio --output-format=json -$
>>>> [ 42.559136]~$
>>>> [ 42.574513] perf version 5.9.rc4.g34d4ddd359db$
>>>> [ 42.574518]~$
>>>> [ 56.152662] BTRFS error (device nvme1n1p1): invalid tree nritems, bytenr=13238272 owner=7 level=0 first_key=(18446744073709551606 128 96941895680) nritems=0
>>>> expect >0$
>>>
>>> Just as expected, this is indeed csum tree.
>>> And it looks like it's indeed still valid.
>>>
>>> The csum root can still have its key from previous not emptry csum.
>>
>> Wait for a minute, if it's csum root empty, we shouldn't have first_key
>> passed in.
>>
>> So this still has something wrong.
>>
>> Would you please try this diff to provide more debug info?
>> (Better to remove the existing diff)
>
> Hi Qu, please find the attached kmsg-debug-2.xz.
> also since I found two identical section in 3b54a0a703,
> so I applied below patch to both places.
> I also attached the final patch as 0001-debug-version-2.patch,
> for you to check if there's problem.

Oh, that's the problem.

The patch I submitted has a bug that the check should only be after that
generation check.

For live tree blocks (newer than last committed) tree blocks can be
empty temporary.

The upstream has already merged the correct fix 62fdaa52a3d0 ("btrfs:
Detect unbalanced tree with empty leaf before crashing btree operations").

Sorry for the false alert and inconvience.

Thanks,
Qu


>
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 75bbe879ed18..6f29a3c38b56 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -400,10 +400,17 @@ int btrfs_verify_level_key(struct extent_buffer
>> *eb, int level,
>>
>> /* We have @first_key, so this @eb must have at least one item */
>> if (btrfs_header_nritems(eb) == 0) {
>> + pr_info("%s: eb start=%llu gen=%llu last_committed=%llu\n",
>> + __func__, eb->start, btrfs_header_generation(eb),
>> + fs_info->last_trans_committed);
>> btrfs_err(fs_info,
>> "invalid tree nritems, bytenr=%llu nritems=0 expect >0",
>> eb->start);
>> - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>> + pr_info("%s: csum tree commit root:\n", __func__);
>> + btrfs_print_tree(fs_info->csum_root->commit_root, true);
>> + pr_info("%s: csum tree current root:\n", __func__);
>> + btrfs_print_tree(fs_info->csum_root->node, true);
>> + WARN_ON(1);
>> return -EUCLEAN;
>> }
>>
>>
>>>
>>> In that case, the check is indeed too strict and causes false alert.
>>>
>>> I'll soon send out a fix with Intel reported-by.
>>>
>>> Thanks,
>>> Qu
>>>
>>>> [ 56.152664] BTRFS error (device nvme1n1p1): invalid tree nritems, bytenr=13238272 owner=7 level=0 first_key=(18446744073709551606 128 96941895680) nritems=0
>>>> expect >0$
>>>> [ 56.152666] ------------[ cut here ]------------$
>>>> [ 56.168263] BTRFS: error (device nvme1n1p1) in btrfs_finish_ordered_io:2687: errno=-117 Filesystem corrupted$
>>>> [ 56.168264] BTRFS info (device nvme1n1p1): forced readonly$
>>>> [ 56.205009] BTRFS: Transaction aborted (error -117)$
>>>> [ 56.210368] WARNING: CPU: 71 PID: 537 at fs/btrfs/inode.c:2687 btrfs_finish_ordered_io+0x70a/0x820 [btrfs]$
>>>> [ 56.220466] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss dm_mod dax_pmem_compat nd_pmem device_dax nd_btt dax_pmem_core btrfs blak
>>>> e2b_generic sr_mod xor cdrom sd_mod zstd_decompress sg zstd_compress raid6_pq libcrc32c intel_rapl_msr intel_rapl_common skx_edac x86_pkg_temp_thermal intel_po
>>>> werclamp coretemp kvm_intel ipmi_ssif kvm irqbypass ast drm_vram_helper crct10dif_pclmul drm_ttm_helper crc32_pclmul ttm crc32c_intel ghash_clmulni_intel rapl
>>>> drm_kms_helper acpi_ipmi syscopyarea intel_cstate sysfillrect ahci sysimgblt nvme libahci fb_sys_fops intel_uncore mei_me nvme_core ioatdma t10_pi drm mei liba
>>>> ta joydev ipmi_si dca wmi ipmi_devintf ipmi_msghandler nfit libnvdimm ip_tables$
>>>> [ 56.285795] CPU: 71 PID: 537 Comm: kworker/u193:28 Not tainted 5.8.0-rc7-00166-g6e85ab8532a52 #1$
>>>> [ 56.295218] Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]$
>>>> [ 56.302044] RIP: 0010:btrfs_finish_ordered_io+0x70a/0x820 [btrfs]$
>>>> [ 56.308762] Code: 48 0a 00 00 02 72 25 41 83 ff fb 0f 84 f2 00 00 00 41 83 ff e2 0f 84 e8 00 00 00 44 89 fe 48 c7 c7 a0 9c 2c c1 e8 58 2e ec bf <0f> 0b 44 8
>>>> 9 f9 ba 7f 0a 00 00 48 c7 c6 50 c7 2b c1 48 89 df e8 15$
>>>> [ 56.328846] RSP: 0018:ffffc90007babd58 EFLAGS: 00010282$
>>>> [ 56.334755] RAX: 0000000000000000 RBX: ffff888fb85984e0 RCX: 0000000000000000$
>>>> [ 56.342577] RDX: ffff8890401e82a0 RSI: ffff8890401d7f60 RDI: ffff8890401d7f60$
>>>> [ 56.350415] RBP: ffff889007e2e4c0 R08: 0000000001200000 R09: 0000000000000000$
>>>> [ 56.358255] R10: 0000000000000001 R11: ffffffffc00bd060 R12: 0000000010e7e000$
>>>> [ 56.366121] R13: 0000000010e7efff R14: ffff888fb86622a8 R15: 00000000ffffff8b$
>>>> [ 56.373983] FS: 0000000000000000(0000) GS:ffff8890401c0000(0000) knlGS:0000000000000000$
>>>> [ 56.382806] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033$
>>>> [ 56.389291] CR2: 00007fa44c4cc448 CR3: 0000005eac7c0006 CR4: 00000000007606e0$
>>>> [ 56.397176] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000$
>>>> [ 56.405065] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400$
>>>> [ 56.412952] PKRU: 55555554$
>>>> [ 56.416419] Call Trace:$
>>>> [ 56.419631] ? update_curr+0xc0/0x200$
>>>> [ 56.424060] ? newidle_balance+0x232/0x3e0$
>>>> [ 56.428958] btrfs_work_helper+0xc9/0x400 [btrfs]$
>>>> [ 56.434441] ? __schedule+0x378/0x860$
>>>> [ 56.438895] process_one_work+0x1b5/0x3a0$
>>>> [ 56.443690] worker_thread+0x50/0x3c0$
>>>> [ 56.446698] /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/
>>>> lkp-csl-2sp3/fio-basic-4k-performance-1SSD-btrfs-nfsv4-sync-100%25-300s-write-128G-ucode=0x400002c-monitor=a122c70f-debian-10.4-x86_64-20200603-20200915-84129-
>>>> 1i8kzyy-0.yaml&job_state=post_run -O /dev/null$
>>>> [ 56.446700]~$
>>>> [ 56.448144] ? process_one_work+0x3a0/0x3a0$
>>>> [ 56.491886] kthread+0x114/0x160$
>>>> [ 56.495986] ? kthread_park+0xa0/0xa0$
>>>> [ 56.500522] ret_from_fork+0x1f/0x30$
>>>> [ 56.504966] ---[ end trace fbe43e164e851f97 ]---$
>>>> [ 56.510432] BTRFS: error (device nvme1n1p1) in btrfs_finish_ordered_io:2687: errno=-117 Filesystem corrupted$
>>>>
>>>>
>>>> I also attached full dmesg 'dmesg-with-debug' for your reference.
>>>>
>>>>>
>>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>>>> index b1a148058773..b050d6fcb90a 100644
>>>>> --- a/fs/btrfs/disk-io.c
>>>>> +++ b/fs/btrfs/disk-io.c
>>>>> @@ -406,8 +406,9 @@ int btrfs_verify_level_key(struct extent_buffer *eb,
>>>>> int level,
>>>>> /* We have @first_key, so this @eb must have at least one item */
>>>>> if (btrfs_header_nritems(eb) == 0) {
>>>>> btrfs_err(fs_info,
>>>>> - "invalid tree nritems, bytenr=%llu nritems=0 expect >0",
>>>>> - eb->start);
>>>>> + "invalid tree nritems, bytenr=%llu owner=%llu level=%d
>>>>> first_key=(%llu %u %llu) nritems=0 expect >0",
>>>>> + eb->start, btrfs_header_owner(eb),
>>>>> btrfs_header_level(eb),
>>>>> + first_key->objectid, first_key->type,
>>>>> first_key->offset);
>>>>> WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
>>>>> return -EUCLEAN;
>>>>> }
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>> [ 50.226906] WARNING: CPU: 71 PID: 500 at fs/btrfs/inode.c:2687 btrfs_finish_ordered_io+0x70a/0x820 [btrfs]
>>>>>> [ 50.236913] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss dm_mod dax_pmem_compat nd_pmem device_dax nd_btt dax_pmem_core btrfs sr_mod blake2b_generic xor cdrom sd_mod zstd_decompress sg zstd_compress raid6_pq libcrc32c intel_rapl_msr intel_rapl_common skx_edac x86_pkg_temp_thermal ipmi_ssif intel_powerclamp coretemp kvm_intel kvm irqbypass ast crct10dif_pclmul drm_vram_helper crc32_pclmul crc32c_intel acpi_ipmi drm_ttm_helper ghash_clmulni_intel ttm rapl drm_kms_helper intel_cstate syscopyarea sysfillrect nvme sysimgblt intel_uncore fb_sys_fops nvme_core ahci libahci t10_pi drm mei_me ioatdma libata mei ipmi_si joydev dca wmi ipmi_devintf ipmi_msghandler nfit libnvdimm ip_tables
>>>>>> [ 50.301669] CPU: 71 PID: 500 Comm: kworker/u193:5 Not tainted 5.8.0-rc7-00165-g3b54a0a703f17 #1
>>>>>> [ 50.310904] Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
>>>>>> [ 50.317626] RIP: 0010:btrfs_finish_ordered_io+0x70a/0x820 [btrfs]
>>>>>> [ 50.324255] Code: 48 0a 00 00 02 72 25 41 83 ff fb 0f 84 f2 00 00 00 41 83 ff e2 0f 84 e8 00 00 00 44 89 fe 48 c7 c7 70 1c 2b c1 e8 58 ae ed bf <0f> 0b 44 89 f9 ba 7f 0a 00 00 48 c7 c6 50 47 2a c1 48 89 df e8 15
>>>>>> [ 50.344116] RSP: 0018:ffffc90007a83d58 EFLAGS: 00010282
>>>>>> [ 50.349923] RAX: 0000000000000000 RBX: ffff888a93ca5ea0 RCX: 0000000000000000
>>>>>> [ 50.357656] RDX: ffff8890401e82a0 RSI: ffff8890401d7f60 RDI: ffff8890401d7f60
>>>>>> [ 50.365385] RBP: ffff8890300ab8a8 R08: 0000000000000bd4 R09: 0000000000000000
>>>>>> [ 50.373133] R10: 0000000000000001 R11: ffffffffc0714060 R12: 000000000f83e000
>>>>>> [ 50.381060] R13: 000000000f83ffff R14: ffff888fb6c39968 R15: 00000000ffffff8b
>>>>>> [ 50.388824] FS: 0000000000000000(0000) GS:ffff8890401c0000(0000) knlGS:0000000000000000
>>>>>> [ 50.397545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 50.404300] CR2: 00007feacc500f98 CR3: 0000000f74422006 CR4: 00000000007606e0
>>>>>> [ 50.412477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>> [ 50.420281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>> [ 50.428082] PKRU: 55555554
>>>>>> [ 50.431451] Call Trace:
>>>>>> [ 50.434570] ? update_curr+0xc0/0x200
>>>>>> [ 50.438919] ? newidle_balance+0x232/0x3e0
>>>>>> [ 50.443700] ? __wake_up_common+0x80/0x180
>>>>>> [ 50.448491] btrfs_work_helper+0xc9/0x400 [btrfs]
>>>>>> [ 50.453880] ? __schedule+0x378/0x860
>>>>>> [ 50.458218] process_one_work+0x1b5/0x3a0
>>>>>> [ 50.462917] worker_thread+0x50/0x3c0
>>>>>> [ 50.467262] ? process_one_work+0x3a0/0x3a0
>>>>>> [ 50.472148] kthread+0x114/0x160
>>>>>> [ 50.476084] ? kthread_park+0xa0/0xa0
>>>>>> [ 50.480445] ret_from_fork+0x1f/0x30
>>>>>> [ 50.484731] ---[ end trace cc096c1a2068030e ]---
>>>>>>
>>>>>>
>>>>>> To reproduce:
>>>>>>
>>>>>> git clone https://github.com/intel/lkp-tests.git
>>>>>> cd lkp-tests
>>>>>> bin/lkp install job.yaml # job file is attached in this email
>>>>>> bin/lkp run job.yaml
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Oliver Sang
>>>>>>
>>>>>
>>