Regression caused by using inode_to_bdi()

From: Zhao Lei
Date: Fri Apr 10 2015 - 07:25:21 EST


Hi, Christoph Hellwig

resend: + cc lkml, linux-fsdevel

Since there was no response to my last mail, I worry there may be some problem
with the mail system; please allow me to resend it.

I found a regression in v4.0-rc1 caused by this patch:
Author: Christoph Hellwig <hch@xxxxxx>
Date: Wed Jan 14 10:42:36 2015 +0100
fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info

The test process is as follows:
2015-02-25 15:50:22: Start
2015-02-25 15:50:22: Linux version:Linux btrfs 4.0.0-rc1_HEAD_c517d838eb7d07bbe9507871fab3931deccff539_ #1 SMP Wed Feb 25 10:59:10 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
2015-02-25 15:50:25: mkfs.btrfs -f /dev/sdb1
2015-02-25 15:50:27: mount /dev/sdb1 /data/ltf/tester
2015-02-25 15:50:28: sysbench --test=fileio --num-threads=1 --file-num=1 --file-block-size=32768 --file-total-size=4G --file-test-mode=seqwr --file-io-mode=sync --file-extra-flags= --file-fsync-freq=0 --file-fsync-end=off --max-requests=131072
2015-02-25 15:51:40: done sysbench

The results are as follows:
v3.19-rc1: testcnt=40 average=135.677 range=[132.460,139.130] stdev=1.610 cv=1.19%
v4.0-rc1: testcnt=40 average=130.970 range=[127.980,132.050] stdev=1.012 cv=0.77%

Then I bisected the above case between v3.19-rc1 and v4.0-rc1, and found that this patch caused the regression.

Maybe it is because the kernel needs more time to call inode_to_bdi(), compared with using inode->i_mapping->backing_dev_info directly as the old code did.

Is there some way to speed it up (inlining it, accessing a struct member directly, ...)?


-- Related data --

Performance data before and after this patch:
v3.19-rc5_00005_495a27 : io_speed: valcnt=10 avg=137.409 range=[134.820,139.000] diff=3.10% stdev=1.574 cv=1.15%
v3.19-rc5_00006_26ff13 : io_speed: valcnt=10 avg=136.534 range=[132.390,139.500] diff=5.37% stdev=2.659 cv=1.95%
v3.19-rc5_00007_de1414 : io_speed: valcnt=10 avg=130.358 range=[129.070,132.150] diff=2.39% stdev=1.120 cv=0.86% <- *this patch*
v3.19-rc5_00008_b83ae6 : io_speed: valcnt=10 avg=129.549 range=[129.200,129.910] diff=0.55% stdev=0.241 cv=0.19%
v3.19-rc5_00011_c4db59 : io_speed: valcnt=10 avg=130.033 range=[129.050,131.620] diff=1.99% stdev=0.854 cv=0.66%


sysbench's detail log in testing:
sysbench 0.4.12: multi-threaded system evaluation benchmark

1 files, 4194304Kb each, 4096Mb total
Creating files for the test...
sysbench 0.4.12: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 1

Extra file open flags: 0
1 files, 4Gb each
4Gb total file size
Block size 32Kb
Using synchronous I/O mode
Doing sequential write (creation) test
Threads started!
Done.

Operations performed: 0 Read, 131072 Write, 0 Other = 131072 Total
Read 0b Written 4Gb Total transferred 4Gb (132.15Mb/sec)
4228.75 Requests/sec executed

Test execution summary:
total time: 30.9955s
total number of events: 131072
total time taken by event execution: 30.8731
per-request statistics:
min: 0.01ms
avg: 0.24ms
max: 30.80ms
approx. 95 percentile: 0.03ms

Threads fairness:
events (avg/stddev): 131072.0000/0.00
execution time (avg/stddev): 30.8731/0.00

sysbench 0.4.12: multi-threaded system evaluation benchmark


My test env:
2G mem, 2-core machine; the test is running on a 1T SATA disk.

[root@btrfs test_nosync_32768__sync_1_seqwr_4G_btrfs_1]# cat /proc/meminfo
MemTotal: 2015812 kB
MemFree: 627416 kB
MemAvailable: 1755488 kB
Buffers: 345876 kB
Cached: 772788 kB
SwapCached: 0 kB
Active: 848864 kB
Inactive: 320044 kB
Active(anon): 54128 kB
Inactive(anon): 5080 kB
Active(file): 794736 kB
Inactive(file): 314964 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 50140 kB
Mapped: 41636 kB
Shmem: 8984 kB
Slab: 200312 kB
SReclaimable: 187308 kB
SUnreclaim: 13004 kB
KernelStack: 1728 kB
PageTables: 4056 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 1007904 kB
Committed_AS: 205956 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 539968 kB
VmallocChunk: 34359195223 kB
HardwareCorrupted: 0 kB
AnonHugePages: 6144 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 61056 kB
DirectMap2M: 2000896 kB
[root@btrfs test_nosync_32768__sync_1_seqwr_4G_btrfs_1]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
stepping : 10
microcode : 0xa0b
cpu MHz : 1603.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bugs :
bogomips : 5851.89
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
stepping : 10
microcode : 0xa0b
cpu MHz : 1603.000
cache size : 3072 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority
bugs :
bogomips : 5851.89
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Thanks
Zhaolei


