RE: Regression caused by using node_to_bdi()

From: Zhao Lei
Date: Wed Apr 01 2015 - 05:57:15 EST


Hi, Christoph

> From: Zhao Lei [mailto:zhaolei@xxxxxxxxxxxxxx]
> Sent: Monday, March 09, 2015 10:47 AM
> To: 'Christoph Hellwig'; 'Jan Kara'
> Cc: 'Tejun Heo'; 'Jens Axboe'
> Subject: RE: Regression caused by using node_to_bdi()
>
> Hi, Christoph and Jan
>
> > From: 'Christoph Hellwig' [mailto:hch@xxxxxx]
> > Sent: Sunday, March 08, 2015 11:34 PM
> > To: Jan Kara
> > Cc: Zhao Lei; 'Christoph Hellwig'; 'Tejun Heo'; 'Jens Axboe'
> > Subject: Re: Regression caused by using node_to_bdi()
> >
> > On Sun, Mar 08, 2015 at 11:29:16AM +0100, Jan Kara wrote:
> > > Frankly, I doubt the cost of inode_to_bdi() is the reason for the
> > > slowdown here. If I read the numbers right, the throughput dropped
> > > from 135 MB/s on average to 130 MB/s on average. Such load is hardly
> > > going to saturate the CPU enough for additional cycles in
> > > inode_to_bdi() to
> > matter.
> > > A load like this is completely IO bound unless you have a really
> > > fast drive (doing GB/s). What are the throughput numbers just before
> > > / after this commit?
>
> These are the performance data before and after this patch in the bisect:

What is your opinion about this regression?
Please tell me if you need additional tests or results from my environment.

Thanks
Zhaolei

>
> v3.19-rc5_00005_495a27 : io_speed: valcnt=10 avg=137.409
> range=[134.820,139.000] diff=3.10% stdev=1.574 cv=1.15%
> v3.19-rc5_00006_26ff13 : io_speed: valcnt=10 avg=136.534
> range=[132.390,139.500] diff=5.37% stdev=2.659 cv=1.95%
> v3.19-rc5_00007_de1414 : io_speed: valcnt=10 avg=130.358
> range=[129.070,132.150] diff=2.39% stdev=1.120 cv=0.86% <- *this patch*
> v3.19-rc5_00008_b83ae6 : io_speed: valcnt=10 avg=129.549
> range=[129.200,129.910] diff=0.55% stdev=0.241 cv=0.19%
> v3.19-rc5_00011_c4db59 : io_speed: valcnt=10 avg=130.033
> range=[129.050,131.620] diff=1.99% stdev=0.854 cv=0.66%
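For reference, the drop at the suspect commit works out to roughly 5% of throughput. A quick check of the averages quoted above (plain arithmetic, no assumptions beyond the figures themselves):

```python
# Average throughput (MB/s) from the bisect runs quoted above.
before = 137.409  # v3.19-rc5_00005_495a27, last run before the suspect patch
after = 130.358   # v3.19-rc5_00007_de1414, the suspect patch

drop = before - after
pct = 100.0 * drop / before
print(f"drop = {drop:.3f} MB/s ({pct:.1f}%)")
```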
>
>
> > What is the CPU load while the benchmark is running?
> >
> I didn't record the CPU load during testing; I'll do so if it is needed for debugging.
>
> Here is one of sysbench's logs:
>
> sysbench 0.4.12: multi-threaded system evaluation benchmark
>
> 1 files, 4194304Kb each, 4096Mb total
> Creating files for the test...
> sysbench 0.4.12: multi-threaded system evaluation benchmark
>
> Running the test with following options:
> Number of threads: 1
>
> Extra file open flags: 0
> 1 files, 4Gb each
> 4Gb total file size
> Block size 32Kb
> Using synchronous I/O mode
> Doing sequential write (creation) test
> Threads started!
> Done.
>
> Operations performed: 0 Read, 131072 Write, 0 Other = 131072 Total
> Read 0b Written 4Gb Total transferred 4Gb (132.15Mb/sec)
> 4228.75 Requests/sec executed
>
> Test execution summary:
> total time: 30.9955s
> total number of events: 131072
> total time taken by event execution: 30.8731
> per-request statistics:
> min: 0.01ms
> avg: 0.24ms
> max: 30.80ms
> approx. 95 percentile: 0.03ms
>
> Threads fairness:
> events (avg/stddev): 131072.0000/0.00
> execution time (avg/stddev): 30.8731/0.00
>
> sysbench 0.4.12: multi-threaded system evaluation benchmark
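As a sanity check, the figures in the log above are self-consistent; a short script re-deriving the totals, throughput, and request rate from the raw counts (nothing here is specific to the kernel under test):

```python
# From the log above: 131072 writes of 32 KB each, total time 30.9955 s.
writes = 131072
block_kb = 32
total_time_s = 30.9955

total_mb = writes * block_kb / 1024     # 4096 MB = 4 GB, as reported
throughput = total_mb / total_time_s    # ~132.15 MB/s, as reported
req_rate = writes / total_time_s        # ~4228.75 requests/sec, as reported
print(f"{total_mb:.0f} MB total, {throughput:.2f} MB/s, {req_rate:.2f} req/s")
```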
>
>
> > How much memory does the machine have?
> >
> 2 GB memory, 2-core machine; the test runs on a 1 TB SATA disk.
>
> [root@btrfs test_nosync_32768__sync_1_seqwr_4G_btrfs_1]# cat
> /proc/meminfo
> MemTotal: 2015812 kB
> MemFree: 627416 kB
> MemAvailable: 1755488 kB
> Buffers: 345876 kB
> Cached: 772788 kB
> SwapCached: 0 kB
> Active: 848864 kB
> Inactive: 320044 kB
> Active(anon): 54128 kB
> Inactive(anon): 5080 kB
> Active(file): 794736 kB
> Inactive(file): 314964 kB
> Unevictable: 0 kB
> Mlocked: 0 kB
> SwapTotal: 0 kB
> SwapFree: 0 kB
> Dirty: 0 kB
> Writeback: 0 kB
> AnonPages: 50140 kB
> Mapped: 41636 kB
> Shmem: 8984 kB
> Slab: 200312 kB
> SReclaimable: 187308 kB
> SUnreclaim: 13004 kB
> KernelStack: 1728 kB
> PageTables: 4056 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 1007904 kB
> Committed_AS: 205956 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 539968 kB
> VmallocChunk: 34359195223 kB
> HardwareCorrupted: 0 kB
> AnonHugePages: 6144 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
> DirectMap4k: 61056 kB
> DirectMap2M: 2000896 kB
> [root@btrfs test_nosync_32768__sync_1_seqwr_4G_btrfs_1]# cat
> /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 23
> model name : Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
> stepping : 10
> microcode : 0xa0b
> cpu MHz : 1603.000
> cache size : 3072 KB
> physical id : 0
> siblings : 2
> core id : 0
> cpu cores : 2
> apicid : 0
> initial apicid : 0
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
> constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
> tpr_shadow vnmi flexpriority
> bugs :
> bogomips : 5851.89
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> processor : 1
> vendor_id : GenuineIntel
> cpu family : 6
> model : 23
> model name : Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
> stepping : 10
> microcode : 0xa0b
> cpu MHz : 1603.000
> cache size : 3072 KB
> physical id : 0
> siblings : 2
> core id : 1
> cpu cores : 2
> apicid : 1
> initial apicid : 1
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
> constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64
> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm
> tpr_shadow vnmi flexpriority
> bugs :
> bogomips : 5851.89
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
> power management:
>
> [root@btrfs test_nosync_32768__sync_1_seqwr_4G_btrfs_1]#
>
> Please tell me if you are interested in more information or further operations.
>
> Thanks
> Zhaolei
>
> > I remember an issue a few years ago where simply reverting a patch that
> > uninlined the rw_sem code fixed a buffered I/O performance regression
> > when using Samba on a very low end arm device, so everything is possible.
> >
> > I'd still like to ensure the numbers are reproducible in this case
> > first, and look at all the information Jan asked for. As a next step
> > we could then look at using an inline version to check if that helps.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/