Re: [LKP] [lkp-robot] [brd] 316ba5736c: aim7.jobs-per-min -11.2% regression

From: Chris Mason
Date: Wed Dec 19 2018 - 14:58:44 EST


On 18 Dec 2018, at 13:57, Jens Axboe wrote:

> On 12/18/18 2:11 AM, kemi wrote:
>> Hi, All
>> Do we have a special reason to keep this patch (316ba5736c9: "brd: Mark
>> as non-rotational")? It leads to a performance regression when BRD is
>> used as a disk on btrfs.
>
> I really suspect that this is a btrfs issue, as this is just flagging
> what is pretty obvious, that a ramdisk is NOT a rotational drive.
> So whatever btrfs is doing with that information is causing it to
> run slower - this really doesn't make any sense, but there we are.
>
> CC'ing Chris, leaving the report below.

Btrfs is changing the allocator decisions slightly for an SSD,
especially the cluster size for metadata, which should show up as more
system time spent in the btrfs allocator, but I'm not seeing that below.
It also changes how quickly btrfs dispatches synchronous IO.
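To check which allocator mode btrfs picked, one can read the rotational hint this commit flips, and force the old behavior with a mount option. A minimal sketch (the device name `ram0`, the mount point, and the helper function name are illustrative assumptions, not from the report):

```shell
# Hypothetical helper, not from the report: build the sysfs path that
# exposes a block device's rotational hint (1 = rotational, 0 = non-rotational).
rotational_flag_path() {
    printf '/sys/block/%s/queue/rotational' "$1"
}

# With 316ba5736c applied, a brd device should report 0 here; before it, 1:
#   cat "$(rotational_flag_path ram0)"
#
# btrfs keys its ssd heuristics off that hint at mount time, so mounting with
# -o nossd should restore the rotational-style allocator for an A/B comparison:
#   mount -o nossd /dev/ram0 /mnt/test
rotational_flag_path ram0
```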

But, some parts of the differential don't quite make sense to me:

>>>> 47.50 ± 58% +1355.8% 691.50 ± 92% meminfo.Mlocked

Are these changes expected?

-chris

>
>> On 2018/7/10 1:27 PM, kemi wrote:
>>> Hi, SeongJae
>>> Do you have any input for this regression? thanks
>>>
>>> On 2018-06-04 13:52, kernel test robot wrote:
>>>>
>>>> Greeting,
>>>>
>>>> FYI, we noticed a -11.2% regression of aim7.jobs-per-min due to
>>>> commit:
>>>>
>>>>
>>>> commit: 316ba5736c9caa5dbcd84085989862d2df57431d ("brd: Mark as
>>>> non-rotational")
>>>> https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git
>>>> for-4.18/block
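For reference, judging from its subject line the commit under test amounts to a one-line queue-flag change in the brd driver; the following is a reconstructed sketch, not the verbatim patch:

```diff
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ brd_alloc()
+	blk_queue_flag_set(QUEUE_FLAG_NONROT, brd->brd_queue);
```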
>>>>
>>>> in testcase: aim7
>>>> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @
>>>> 3.00GHz with 384G memory
>>>> with following parameters:
>>>>
>>>> disk: 1BRD_48G
>>>> fs: btrfs
>>>> test: disk_rw
>>>> load: 1500
>>>> cpufreq_governor: performance
>>>>
>>>> test-description: AIM7 is a traditional UNIX system-level benchmark
>>>> suite used to test and measure the performance of multiuser systems.
>>>> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
>>>>
>>>>
>>>>
>>>> Details are as below:
>>>> -------------------------------------------------------------------------------------------------->
>>>>
>>>> =========================================================================================
>>>> compiler/cpufreq_governor/disk/fs/kconfig/load/rootfs/tbox_group/test/testcase:
>>>> gcc-7/performance/1BRD_48G/btrfs/x86_64-rhel-7.2/1500/debian-x86_64-2016-08-31.cgz/lkp-ivb-ep01/disk_rw/aim7
>>>>
>>>> commit:
>>>> 522a777566 ("block: consolidate struct request timestamp fields")
>>>> 316ba5736c ("brd: Mark as non-rotational")
>>>>
>>>> 522a777566f56696 316ba5736c9caa5dbcd8408598
>>>> ---------------- --------------------------
>>>> %stddev %change %stddev
>>>> \ | \
>>>> 28321 -11.2% 25147 aim7.jobs-per-min
>>>> 318.19 +12.6% 358.23 aim7.time.elapsed_time
>>>> 318.19 +12.6% 358.23 aim7.time.elapsed_time.max
>>>> 1437526 ± 2% +14.6% 1646849 ± 2% aim7.time.involuntary_context_switches
>>>> 11986 +14.2% 13691 aim7.time.system_time
>>>> 73.06 ± 2% -3.6% 70.43 aim7.time.user_time
>>>> 2449470 ± 2% -25.0% 1837521 ± 4% aim7.time.voluntary_context_switches
>>>> 20.25 ± 58% +1681.5% 360.75 ±109% numa-meminfo.node1.Mlocked
>>>> 456062 -16.3% 381859 softirqs.SCHED
>>>> 9015 ± 7% -21.3% 7098 ± 22% meminfo.CmaFree
>>>> 47.50 ± 58% +1355.8% 691.50 ± 92% meminfo.Mlocked
>>>> 5.24 ± 3% -1.2 3.99 ± 2% mpstat.cpu.idle%
>>>> 0.61 ± 2% -0.1 0.52 ± 2% mpstat.cpu.usr%
>>>> 16627 +12.8% 18762 ± 4% slabinfo.Acpi-State.active_objs
>>>> 16627 +12.9% 18775 ± 4% slabinfo.Acpi-State.num_objs
>>>> 57.00 ± 2% +17.5% 67.00 vmstat.procs.r
>>>> 20936 -24.8% 15752 ± 2% vmstat.system.cs
>>>> 45474 -1.7% 44681 vmstat.system.in
>>>> 6.50 ± 59% +1157.7% 81.75 ± 75% numa-vmstat.node0.nr_mlock
>>>> 242870 ± 3% +13.2% 274913 ± 7% numa-vmstat.node0.nr_written
>>>> 2278 ± 7% -22.6% 1763 ± 21% numa-vmstat.node1.nr_free_cma
>>>> 4.75 ± 58% +1789.5% 89.75 ±109% numa-vmstat.node1.nr_mlock
>>>> 88018135 ± 3% -48.9% 44980457 ± 7% cpuidle.C1.time
>>>> 1398288 ± 3% -51.1% 683493 ± 9% cpuidle.C1.usage
>>>> 3499814 ± 2% -38.5% 2153158 ± 5% cpuidle.C1E.time
>>>> 52722 ± 4% -45.6% 28692 ± 6% cpuidle.C1E.usage
>>>> 9865857 ± 3% -40.1% 5905155 ± 5% cpuidle.C3.time
>>>> 69656 ± 2% -42.6% 39990 ± 5% cpuidle.C3.usage
>>>> 590856 ± 2% -12.3% 517910 cpuidle.C6.usage
>>>> 46160 ± 7% -53.7% 21372 ± 11% cpuidle.POLL.time
>>>> 1716 ± 7% -46.6% 916.25 ± 14% cpuidle.POLL.usage
>>>> 197656 +4.1% 205732 proc-vmstat.nr_active_file
>>>> 191867 +4.1% 199647 proc-vmstat.nr_dirty
>>>> 509282 +1.6% 517318 proc-vmstat.nr_file_pages
>>>> 2282 ± 8% -24.4% 1725 ± 22% proc-vmstat.nr_free_cma
>>>> 357.50 +10.6% 395.25 ± 2% proc-vmstat.nr_inactive_file
>>>> 11.50 ± 58% +1397.8% 172.25 ± 93% proc-vmstat.nr_mlock
>>>> 970355 ± 4% +14.6% 1111549 ± 8% proc-vmstat.nr_written
>>>> 197984 +4.1% 206034 proc-vmstat.nr_zone_active_file
>>>> 357.50 +10.6% 395.25 ± 2% proc-vmstat.nr_zone_inactive_file
>>>> 192282 +4.1% 200126 proc-vmstat.nr_zone_write_pending
>>>> 7901465 ± 3% -14.0% 6795016 ± 16% proc-vmstat.pgalloc_movable
>>>> 886101 +10.2% 976329 proc-vmstat.pgfault
>>>> 2.169e+12 +15.2% 2.497e+12 perf-stat.branch-instructions
>>>> 0.41 -0.1 0.35 perf-stat.branch-miss-rate%
>>>> 31.19 ± 2% +1.6 32.82 perf-stat.cache-miss-rate%
>>>> 9.116e+09 +8.3% 9.869e+09 perf-stat.cache-misses
>>>> 2.924e+10 +2.9% 3.008e+10 ± 2% perf-stat.cache-references
>>>> 6712739 ± 2% -15.4% 5678643 ± 2% perf-stat.context-switches
>>>> 4.02 +2.7% 4.13 perf-stat.cpi
>>>> 3.761e+13 +17.3% 4.413e+13 perf-stat.cpu-cycles
>>>> 606958 -13.7% 523758 ± 2% perf-stat.cpu-migrations
>>>> 2.476e+12 +13.4% 2.809e+12 perf-stat.dTLB-loads
>>>> 0.18 ± 2% -0.0 0.16 ± 9% perf-stat.dTLB-store-miss-rate%
>>>> 1.079e+09 ± 2% -9.6% 9.755e+08 ± 9% perf-stat.dTLB-store-misses
>>>> 5.933e+11 +1.6% 6.029e+11 perf-stat.dTLB-stores
>>>> 9.349e+12 +14.2% 1.068e+13 perf-stat.instructions
>>>> 11247 ± 11% +19.8% 13477 ± 9% perf-stat.instructions-per-iTLB-miss
>>>> 0.25 -2.6% 0.24 perf-stat.ipc
>>>> 865561 +10.3% 954350 perf-stat.minor-faults
>>>> 2.901e+09 ± 3% +9.8% 3.186e+09 ± 3% perf-stat.node-load-misses
>>>> 3.682e+09 ± 3% +11.0% 4.088e+09 ± 3% perf-stat.node-loads
>>>> 3.778e+09 +4.8% 3.959e+09 ± 2% perf-stat.node-store-misses
>>>> 5.079e+09 +6.4% 5.402e+09 perf-stat.node-stores
>>>> 865565 +10.3% 954352 perf-stat.page-faults
>>>> 51.75 ± 5% -12.5% 45.30 ± 10% sched_debug.cfs_rq:/.load_avg.avg
>>>> 316.35 ± 3% +17.2% 370.81 ± 8% sched_debug.cfs_rq:/.util_est_enqueued.stddev
>>>> 15294 ± 30% +234.9% 51219 ± 76% sched_debug.cpu.avg_idle.min
>>>> 299443 ± 3% -7.3% 277566 ± 5% sched_debug.cpu.avg_idle.stddev
>>>> 1182 ± 19% -26.3% 872.02 ± 13% sched_debug.cpu.nr_load_updates.stddev
>>>> 1.22 ± 8% +21.7% 1.48 ± 6% sched_debug.cpu.nr_running.avg
>>>> 2.75 ± 10% +26.2% 3.47 ± 6% sched_debug.cpu.nr_running.max
>>>> 0.58 ± 7% +24.2% 0.73 ± 6% sched_debug.cpu.nr_running.stddev
>>>> 77148 -20.0% 61702 ± 7% sched_debug.cpu.nr_switches.avg
>>>> 70024 -24.8% 52647 ± 8% sched_debug.cpu.nr_switches.min
>>>> 6662 ± 6% +61.9% 10789 ± 24% sched_debug.cpu.nr_switches.stddev
>>>> 80.45 ± 18% -19.1% 65.05 ± 6% sched_debug.cpu.nr_uninterruptible.stddev
>>>> 76819 -19.3% 62008 ± 8% sched_debug.cpu.sched_count.avg
>>>> 70616 -23.5% 53996 ± 8% sched_debug.cpu.sched_count.min
>>>> 5494 ± 9% +85.3% 10179 ± 26% sched_debug.cpu.sched_count.stddev
>>>> 16936 -52.9% 7975 ± 9% sched_debug.cpu.sched_goidle.avg
>>>> 19281 -49.9% 9666 ± 7% sched_debug.cpu.sched_goidle.max
>>>> 15417 -54.8% 6962 ± 10% sched_debug.cpu.sched_goidle.min
>>>> 875.00 ± 6% -35.0% 569.09 ± 13% sched_debug.cpu.sched_goidle.stddev
>>>> 40332 -23.5% 30851 ± 7% sched_debug.cpu.ttwu_count.avg
>>>> 35074 -26.3% 25833 ± 6% sched_debug.cpu.ttwu_count.min
>>>> 3239 ± 8% +67.4% 5422 ± 28% sched_debug.cpu.ttwu_count.stddev
>>>> 5232 +27.4% 6665 ± 13% sched_debug.cpu.ttwu_local.avg
>>>> 15877 ± 12% +77.5% 28184 ± 27% sched_debug.cpu.ttwu_local.max
>>>> 2530 ± 10% +95.9% 4956 ± 27% sched_debug.cpu.ttwu_local.stddev
>>>> 2.52 ± 7% -0.6 1.95 ± 3% perf-profile.calltrace.cycles-pp.btrfs_dirty_pages.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write.vfs_write
>>>> 1.48 ± 12% -0.5 1.01 ± 4% perf-profile.calltrace.cycles-pp.btrfs_get_extent.btrfs_dirty_pages.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write
>>>> 1.18 ± 16% -0.4 0.76 ± 7% perf-profile.calltrace.cycles-pp.btrfs_search_slot.btrfs_lookup_file_extent.btrfs_get_extent.btrfs_dirty_pages.__btrfs_buffered_write
>>>> 1.18 ± 16% -0.4 0.76 ± 7% perf-profile.calltrace.cycles-pp.btrfs_lookup_file_extent.btrfs_get_extent.btrfs_dirty_pages.__btrfs_buffered_write.btrfs_file_write_iter
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.calltrace.cycles-pp.__dentry_kill.dentry_kill.dput.__fput.task_work_run
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.calltrace.cycles-pp.evict.__dentry_kill.dentry_kill.dput.__fput
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.calltrace.cycles-pp.dentry_kill.dput.__fput.task_work_run.exit_to_usermode_loop
>>>> 0.90 ± 18% -0.3 0.56 ± 4% perf-profile.calltrace.cycles-pp.btrfs_evict_inode.evict.__dentry_kill.dentry_kill.dput
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.calltrace.cycles-pp.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.calltrace.cycles-pp.dput.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64
>>>> 1.69 -0.1 1.54 ± 2% perf-profile.calltrace.cycles-pp.lock_and_cleanup_extent_if_need.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write.vfs_write
>>>> 0.87 ± 4% -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.__clear_extent_bit.clear_extent_bit.lock_and_cleanup_extent_if_need.__btrfs_buffered_write.btrfs_file_write_iter
>>>> 0.87 ± 4% -0.1 0.76 ± 2% perf-profile.calltrace.cycles-pp.clear_extent_bit.lock_and_cleanup_extent_if_need.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write
>>>> 0.71 ± 6% -0.1 0.61 ± 2% perf-profile.calltrace.cycles-pp.clear_state_bit.__clear_extent_bit.clear_extent_bit.lock_and_cleanup_extent_if_need.__btrfs_buffered_write
>>>> 0.69 ± 6% -0.1 0.60 ± 2% perf-profile.calltrace.cycles-pp.btrfs_clear_bit_hook.clear_state_bit.__clear_extent_bit.clear_extent_bit.lock_and_cleanup_extent_if_need
>>>> 96.77 +0.6 97.33 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
>>>> 0.00 +0.6 0.56 ± 3% perf-profile.calltrace.cycles-pp.can_overcommit.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.__btrfs_buffered_write.btrfs_file_write_iter
>>>> 96.72 +0.6 97.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 43.13 +0.8 43.91 perf-profile.calltrace.cycles-pp.btrfs_inode_rsv_release.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write.vfs_write
>>>> 42.37 +0.8 43.16 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.block_rsv_release_bytes.btrfs_inode_rsv_release.__btrfs_buffered_write
>>>> 43.11 +0.8 43.89 perf-profile.calltrace.cycles-pp.block_rsv_release_bytes.btrfs_inode_rsv_release.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write
>>>> 42.96 +0.8 43.77 perf-profile.calltrace.cycles-pp._raw_spin_lock.block_rsv_release_bytes.btrfs_inode_rsv_release.__btrfs_buffered_write.btrfs_file_write_iter
>>>> 95.28 +0.9 96.23 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 95.22 +1.0 96.18 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 94.88 +1.0 95.85 perf-profile.calltrace.cycles-pp.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>>>> 94.83 +1.0 95.80 perf-profile.calltrace.cycles-pp.btrfs_file_write_iter.__vfs_write.vfs_write.ksys_write.do_syscall_64
>>>> 94.51 +1.0 95.50 perf-profile.calltrace.cycles-pp.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write.vfs_write.ksys_write
>>>> 42.44 +1.1 43.52 perf-profile.calltrace.cycles-pp._raw_spin_lock.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.__btrfs_buffered_write.btrfs_file_write_iter
>>>> 42.09 +1.1 43.18 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.__btrfs_buffered_write
>>>> 44.07 +1.2 45.29 perf-profile.calltrace.cycles-pp.btrfs_delalloc_reserve_metadata.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write.vfs_write
>>>> 43.42 +1.3 44.69 perf-profile.calltrace.cycles-pp.reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.__btrfs_buffered_write.btrfs_file_write_iter.__vfs_write
>>>> 2.06 ± 18% -0.9 1.21 ± 6% perf-profile.children.cycles-pp.btrfs_search_slot
>>>> 2.54 ± 7% -0.6 1.96 ± 3% perf-profile.children.cycles-pp.btrfs_dirty_pages
>>>> 1.05 ± 24% -0.5 0.52 ± 9% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>>>> 1.50 ± 12% -0.5 1.03 ± 4% perf-profile.children.cycles-pp.btrfs_get_extent
>>>> 1.22 ± 15% -0.4 0.79 ± 8% perf-profile.children.cycles-pp.btrfs_lookup_file_extent
>>>> 0.81 ± 5% -0.4 0.41 ± 6% perf-profile.children.cycles-pp.btrfs_calc_reclaim_metadata_size
>>>> 0.74 ± 24% -0.4 0.35 ± 9% perf-profile.children.cycles-pp.btrfs_lock_root_node
>>>> 0.74 ± 24% -0.4 0.35 ± 9% perf-profile.children.cycles-pp.btrfs_tree_lock
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.children.cycles-pp.__dentry_kill
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.children.cycles-pp.evict
>>>> 0.90 ± 17% -0.3 0.56 ± 4% perf-profile.children.cycles-pp.dentry_kill
>>>> 0.90 ± 18% -0.3 0.56 ± 4% perf-profile.children.cycles-pp.btrfs_evict_inode
>>>> 0.91 ± 18% -0.3 0.57 ± 4% perf-profile.children.cycles-pp.exit_to_usermode_loop
>>>> 0.52 ± 20% -0.3 0.18 ± 14% perf-profile.children.cycles-pp.do_idle
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.children.cycles-pp.task_work_run
>>>> 0.90 ± 17% -0.3 0.57 ± 5% perf-profile.children.cycles-pp.__fput
>>>> 0.90 ± 18% -0.3 0.57 ± 4% perf-profile.children.cycles-pp.dput
>>>> 0.51 ± 20% -0.3 0.18 ± 14% perf-profile.children.cycles-pp.secondary_startup_64
>>>> 0.51 ± 20% -0.3 0.18 ± 14% perf-profile.children.cycles-pp.cpu_startup_entry
>>>> 0.50 ± 21% -0.3 0.17 ± 16% perf-profile.children.cycles-pp.start_secondary
>>>> 0.47 ± 20% -0.3 0.16 ± 13% perf-profile.children.cycles-pp.cpuidle_enter_state
>>>> 0.47 ± 19% -0.3 0.16 ± 13% perf-profile.children.cycles-pp.intel_idle
>>>> 0.61 ± 20% -0.3 0.36 ± 11% perf-profile.children.cycles-pp.btrfs_tree_read_lock
>>>> 0.47 ± 26% -0.3 0.21 ± 10% perf-profile.children.cycles-pp.prepare_to_wait_event
>>>> 0.64 ± 18% -0.2 0.39 ± 9% perf-profile.children.cycles-pp.btrfs_read_lock_root_node
>>>> 0.40 ± 22% -0.2 0.21 ± 5% perf-profile.children.cycles-pp.btrfs_clear_path_blocking
>>>> 0.38 ± 23% -0.2 0.19 ± 13% perf-profile.children.cycles-pp.finish_wait
>>>> 1.51 ± 3% -0.2 1.35 ± 2% perf-profile.children.cycles-pp.__clear_extent_bit
>>>> 1.71 -0.1 1.56 ± 2% perf-profile.children.cycles-pp.lock_and_cleanup_extent_if_need
>>>> 0.29 ± 25% -0.1 0.15 ± 10% perf-profile.children.cycles-pp.btrfs_orphan_del
>>>> 0.27 ± 27% -0.1 0.12 ± 8% perf-profile.children.cycles-pp.btrfs_del_orphan_item
>>>> 0.33 ± 18% -0.1 0.19 ± 9% perf-profile.children.cycles-pp.queued_read_lock_slowpath
>>>> 0.33 ± 19% -0.1 0.20 ± 4% perf-profile.children.cycles-pp.__wake_up_common_lock
>>>> 0.45 ± 15% -0.1 0.34 ± 2% perf-profile.children.cycles-pp.btrfs_alloc_data_chunk_ondemand
>>>> 0.47 ± 16% -0.1 0.36 ± 4% perf-profile.children.cycles-pp.btrfs_check_data_free_space
>>>> 0.91 ± 4% -0.1 0.81 ± 3% perf-profile.children.cycles-pp.clear_extent_bit
>>>> 1.07 ± 5% -0.1 0.97 perf-profile.children.cycles-pp.__set_extent_bit
>>>> 0.77 ± 6% -0.1 0.69 ± 3% perf-profile.children.cycles-pp.btrfs_clear_bit_hook
>>>> 0.17 ± 20% -0.1 0.08 ± 10% perf-profile.children.cycles-pp.queued_write_lock_slowpath
>>>> 0.16 ± 22% -0.1 0.08 ± 24% perf-profile.children.cycles-pp.btrfs_lookup_inode
>>>> 0.21 ± 17% -0.1 0.14 ± 19% perf-profile.children.cycles-pp.__btrfs_update_delayed_inode
>>>> 0.26 ± 12% -0.1 0.18 ± 13% perf-profile.children.cycles-pp.btrfs_async_run_delayed_root
>>>> 0.52 ± 5% -0.1 0.45 perf-profile.children.cycles-pp.set_extent_bit
>>>> 0.45 ± 5% -0.1 0.40 ± 3% perf-profile.children.cycles-pp.alloc_extent_state
>>>> 0.11 ± 17% -0.1 0.06 ± 11% perf-profile.children.cycles-pp.btrfs_clear_lock_blocking_rw
>>>> 0.28 ± 9% -0.0 0.23 ± 3% perf-profile.children.cycles-pp.btrfs_drop_pages
>>>> 0.07 -0.0 0.03 ±100% perf-profile.children.cycles-pp.btrfs_set_lock_blocking_rw
>>>> 0.39 ± 3% -0.0 0.34 ± 3% perf-profile.children.cycles-pp.get_alloc_profile
>>>> 0.33 ± 7% -0.0 0.29 perf-profile.children.cycles-pp.btrfs_set_extent_delalloc
>>>> 0.38 ± 2% -0.0 0.35 ± 4% perf-profile.children.cycles-pp.__set_page_dirty_nobuffers
>>>> 0.49 ± 3% -0.0 0.46 ± 3% perf-profile.children.cycles-pp.pagecache_get_page
>>>> 0.18 ± 4% -0.0 0.15 ± 2% perf-profile.children.cycles-pp.truncate_inode_pages_range
>>>> 0.08 ± 5% -0.0 0.05 ± 9% perf-profile.children.cycles-pp.btrfs_set_path_blocking
>>>> 0.08 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.truncate_cleanup_page
>>>> 0.80 ± 4% +0.2 0.95 ± 2% perf-profile.children.cycles-pp.can_overcommit
>>>> 96.84 +0.5 97.37 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>>>> 96.80 +0.5 97.35 perf-profile.children.cycles-pp.do_syscall_64
>>>> 43.34 +0.8 44.17 perf-profile.children.cycles-pp.btrfs_inode_rsv_release
>>>> 43.49 +0.8 44.32 perf-profile.children.cycles-pp.block_rsv_release_bytes
>>>> 95.32 +0.9 96.26 perf-profile.children.cycles-pp.ksys_write
>>>> 95.26 +0.9 96.20 perf-profile.children.cycles-pp.vfs_write
>>>> 94.91 +1.0 95.88 perf-profile.children.cycles-pp.__vfs_write
>>>> 94.84 +1.0 95.81 perf-profile.children.cycles-pp.btrfs_file_write_iter
>>>> 94.55 +1.0 95.55 perf-profile.children.cycles-pp.__btrfs_buffered_write
>>>> 86.68 +1.0 87.70 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>>>> 44.08 +1.2 45.31 perf-profile.children.cycles-pp.btrfs_delalloc_reserve_metadata
>>>> 43.49 +1.3 44.77 perf-profile.children.cycles-pp.reserve_metadata_bytes
>>>> 87.59 +1.8 89.38 perf-profile.children.cycles-pp._raw_spin_lock
>>>> 0.47 ± 19% -0.3 0.16 ± 13% perf-profile.self.cycles-pp.intel_idle
>>>> 0.33 ± 6% -0.1 0.18 ± 6% perf-profile.self.cycles-pp.get_alloc_profile
>>>> 0.27 ± 8% -0.0 0.22 ± 4% perf-profile.self.cycles-pp.btrfs_drop_pages
>>>> 0.07 -0.0 0.03 ±100% perf-profile.self.cycles-pp.btrfs_set_lock_blocking_rw
>>>> 0.14 ± 5% -0.0 0.12 ± 6% perf-profile.self.cycles-pp.clear_page_dirty_for_io
>>>> 0.09 ± 5% -0.0 0.07 ± 10% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>>>> 0.17 ± 4% +0.1 0.23 ± 3% perf-profile.self.cycles-pp.reserve_metadata_bytes
>>>> 0.31 ± 7% +0.1 0.45 ± 2% perf-profile.self.cycles-pp.can_overcommit
>>>> 86.35 +1.0 87.39 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>>>>
>>>>
>>>>
>>>> aim7.jobs-per-min
>>>>
>>>> [ASCII plot garbled by email line wrapping; the parent commit's
>>>> samples (*) cluster around 27500-28500 jobs-per-min, while the
>>>> bisect-bad samples (O) cluster around 24500-25500.]
>>>>
>>>>
>>>> [*] bisect-good sample
>>>> [O] bisect-bad sample
>>>>
>>>>
>>>> Disclaimer:
>>>> Results have been estimated based on internal Intel analysis and are
>>>> provided for informational purposes only. Any difference in system
>>>> hardware or software design or configuration may affect actual
>>>> performance.
>>>>
>>>>
>>>> Thanks,
>>>> Xiaolong
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LKP mailing list
>>>> LKP@xxxxxxxxxxxx
>>>> https://lists.01.org/mailman/listinfo/lkp
>>>>
>
>
> --
> Jens Axboe