[Regression or Fix]perf: profiling stats sigificantly changed for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0

From: David Wang
Date: Mon Nov 13 2023 - 23:32:07 EST


Hi,

I was making kernel profiling to identify kernel changes between revisions under the load of fio benchmark,
The profiler use perf_event_open with a cgroup where a fio run a 10minutes stress: fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --bs=4k --iodepth=64 --size=1G --readwrite=randrw --runtime=600 --numjobs=4 --time_based=1

I notice a ~4% change in the profiling data.
The profiling stat for 6.6.0:

aio_write(54.974% 117153/213108)
ext4_file_write_iter(97.395% 114101/117153)
iomap_dio_rw(91.195% 104054/114101)
__iomap_dio_rw(99.835% 103882/104054)
iomap_dio_bio_iter(68.800% 71471/103882) <<---
blk_finish_plug(22.146% 23006/103882) <<---
iomap_iter(6.161% 6400/103882)
kmalloc_trace(0.850% 883/103882)
kiocb_invalidate_pages(0.205% 213/103882)
invalidate_inode_pages2_range(0.130% 135/103882)
srso_return_thunk(0.109% 113/103882)


And for 6.7.0-rc1

aio_write(54.317% 96039/176813)
ext4_file_write_iter(97.482% 93621/96039)
iomap_dio_rw(91.211% 85393/93621)
__iomap_dio_rw(99.822% 85241/85393)
iomap_dio_bio_iter(64.793% 55230/85241) <<-----
blk_finish_plug(26.446% 22543/85241) <<----
iomap_iter(5.881% 5013/85241)
kmalloc_trace(0.682% 581/85241)
kiocb_invalidate_pages(0.230% 196/85241)
invalidate_inode_pages2_range(0.147% 125/85241)


The iomap_dio_bio_iter samples decrease 4% while blk_finish_plug increase 4%.

I have run a bisect, narrow down to following commit:
commit f06cc667f79909e9175460b167c277b7c64d3df0
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon Oct 9 23:04:25 2023 +0200

perf: Optimize perf_cgroup_switch()

And after I revert this commit, the profiling stat now almost agree with 6.6.0

aio_write(54.937% 104691/190564)
ext4_file_write_iter(97.586% 102164/104691)
iomap_dio_rw(91.257% 93232/102164)
__iomap_dio_rw(99.819% 93063/93232)
iomap_dio_bio_iter(69.654% 64822/93063) <<------
blk_finish_plug(21.724% 20217/93063) <<----
iomap_iter(5.859% 5453/93063)
kmalloc_trace(0.758% 705/93063)
kiocb_invalidate_pages(0.251% 234/93063)
invalidate_inode_pages2_range(0.134% 125/93063)


I have run several round of tests, mostly during the bisect process, and the ~4% change is consistent.
The profiler I used is an experimental tool, not perf though, I think the behavior changes it detected is legit.
https://github.com/zq-david-wang/linux-tools/blob/main/perf/profiler/profiler.cpp

But I did not notice a significant change in the overhead of perf tracing based on the fio report,
that is why I am not sure whether this is a regression or an improvement....
Just report the change.


Thanks
David Wang