pgbench buffer readwrite mode showing around 24% perf regression
From: samasth . norway . ananda
Date: Mon Mar 03 2025 - 15:37:28 EST
Hi,
We recently discovered a performance regression of around 24% in the
pgbench buffer readwrite metric on the 6.12 kernel.
Bisecting narrowed it down to commit
9a42891c35d50a8472b42c61256867b4dfcc1941 (“block: fix lost bio for plug
enabled bio based device”).
The PostgreSQL database used for this benchmark is stored on an xfs
filesystem on top of an LVM stripe across 6 disks.
lsblk output -
NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda             8:0    0 557.9G  0 disk
└─sda1          8:1    0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdb             8:16   0 557.9G  0 disk
└─sdb1          8:17   0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdc             8:32   0 557.9G  0 disk
└─sdc1          8:33   0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdd             8:48   0 557.9G  0 disk
sde             8:64   0 557.9G  0 disk
└─sde1          8:65   0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdf             8:80   0 557.9G  0 disk
└─sdf1          8:81   0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdg             8:96   0 557.9G  0 disk
└─sdg1          8:97   0 557.9G  0 part
  └─tank-lvm  252:0    0   3.3T  0 lvm  /data
sdh             8:112  0 557.9G  0 disk
├─sdh1          8:113  0     1M  0 part
├─sdh2          8:114  0   9.8G  0 part /boot
├─sdh3          8:115  0    32G  0 part /
├─sdh4          8:116  0     1K  0 part
├─sdh5          8:117  0    16G  0 part [SWAP]
└─sdh6          8:118  0 500.1G  0 part /export/bench
0 /var/crash
The regression was only observed on some architectures.
I ran perf on the pgbench run and got the following reports.
Base kernel based on 6.12 -
# Overhead  Command  Shared Object            Symbol
# ........  .......  .......................  ....................................
#
     8.34%  pgbench  [kernel.kallsyms]        [k] update_sg_lb_stats.isra.0
     3.34%  pgbench  [kernel.kallsyms]        [k] entry_SYSRETQ_unsafe_stack
     2.57%  pgbench  [kernel.kallsyms]        [k] sched_balance_find_src_rq
     2.44%  pgbench  [kernel.kallsyms]        [k] _find_next_and_bit
     2.08%  pgbench  [kernel.kallsyms]        [k] idle_cpu
     1.99%  pgbench  [kernel.kallsyms]        [k] __raw_spin_lock_irqsave
     1.56%  pgbench  [kernel.kallsyms]        [k] _raw_spin_lock
     1.31%  pgbench  [kernel.kallsyms]        [k] native_queued_spin_lock_slowpath
     1.31%  pgbench  libpq.so.private16-5.16  [.] printfPQExpBuffer
     1.13%  pgbench  [kernel.kallsyms]        [k] cpu_util.constprop.0
     1.11%  pgbench  [kernel.kallsyms]        [k] syscall_return_via_sysret
     1.10%  pgbench  [kernel.kallsyms]        [k] rep_movs_alternative
     1.08%  pgbench  [kernel.kallsyms]        [k] unix_poll
----------------------cut here----------------------------
Base kernel based on 6.12, with the bisected commit reverted -
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ....................................
#
     9.19%  pgbench  [kernel.kallsyms]  [k] update_sg_lb_stats.isra.0
     3.44%  pgbench  [kernel.kallsyms]  [k] entry_SYSRETQ_unsafe_stack
     3.39%  pgbench  [kernel.kallsyms]  [k] _find_next_and_bit
     1.85%  pgbench  [kernel.kallsyms]  [k] __raw_spin_lock_irqsave
     1.83%  pgbench  [kernel.kallsyms]  [k] idle_cpu
     1.79%  pgbench  [kernel.kallsyms]  [k] sched_balance_find_src_rq
     1.61%  pgbench  [kernel.kallsyms]  [k] _raw_spin_lock
     1.21%  pgbench  [kernel.kallsyms]  [k] entry_SYSCALL_64
     1.16%  pgbench  [kernel.kallsyms]  [k] unix_poll
     1.16%  pgbench  [kernel.kallsyms]  [k] syscall_return_via_sysret
     1.12%  pgbench  [kernel.kallsyms]  [k] cpu_util.constprop.0
     1.09%  pgbench  [kernel.kallsyms]  [k] native_queued_spin_lock_slowpath
----------------------cut here----------------------------
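For anyone who wants to compare the two profiles above symbol by symbol, here is a minimal Python sketch. It assumes the reports are available as plain text with one row per line, and it embeds only three rows from each listing as sample input. The helper names parse_report and diff_reports are hypothetical, not part of perf. If the original perf.data files are at hand, perf diff is probably the better tool.

```python
import re

# One data row of perf report text output, e.g.
#   8.34%  pgbench  [kernel.kallsyms]  [k] update_sg_lb_stats.isra.0
# Assumption: symbol names contain no spaces (true for the kernel symbols here).
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+\S+\s+\S+\s+\[.\]\s+(\S+)\s*$")


def parse_report(text):
    """Map each symbol to its overhead percentage; header lines are skipped."""
    overheads = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            overheads[match.group(2)] = float(match.group(1))
    return overheads


def diff_reports(with_commit, reverted):
    """Return (symbol, with_commit, reverted, delta) rows, largest |delta| first.

    Only symbols present in both reports are compared: a symbol missing from
    one listing may have fallen below the cut-off rather than dropped to zero.
    """
    rows = []
    for symbol in with_commit.keys() & reverted.keys():
        delta = round(with_commit[symbol] - reverted[symbol], 2)
        rows.append((symbol, with_commit[symbol], reverted[symbol], delta))
    rows.sort(key=lambda row: (-abs(row[3]), row[0]))
    return rows


# Sample input: three rows copied from each of the two reports in this mail.
WITH_COMMIT = """
# Overhead  Command  Shared Object      Symbol
     8.34%  pgbench  [kernel.kallsyms]  [k] update_sg_lb_stats.isra.0
     2.57%  pgbench  [kernel.kallsyms]  [k] sched_balance_find_src_rq
     2.44%  pgbench  [kernel.kallsyms]  [k] _find_next_and_bit
"""

REVERTED = """
# Overhead  Command  Shared Object      Symbol
     9.19%  pgbench  [kernel.kallsyms]  [k] update_sg_lb_stats.isra.0
     3.39%  pgbench  [kernel.kallsyms]  [k] _find_next_and_bit
     1.79%  pgbench  [kernel.kallsyms]  [k] sched_balance_find_src_rq
"""

for symbol, before, after, delta in diff_reports(
    parse_report(WITH_COMMIT), parse_report(REVERTED)
):
    print(f"{delta:+6.2f}  {before:5.2f}  {after:5.2f}  {symbol}")
```

The deltas are percentage points of sampled overhead within each run, not absolute time, so they only hint at where the profile shape changed.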
Please let me know if you need more information.
Thanks,
Samasth.