pgbench buffer readwrite mode showing around 24% perf regression

From: samasth . norway . ananda
Date: Mon Mar 03 2025 - 15:37:28 EST



Hi,


We recently discovered a performance regression of around 24% in the pgbench buffer readwrite metric on the 6.12 kernel.
After bisecting, we were able to narrow it down to the commit
9a42891c35d50a8472b42c61256867b4dfcc1941 (“block: fix lost bio for plug enabled bio based device”)

The PostgreSQL database used for this benchmark is stored on an XFS filesystem on top of a stripe across 6 disks.
lsblk output -

NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda              8:0    0 557.9G  0 disk
└─sda1           8:1    0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdb              8:16   0 557.9G  0 disk
└─sdb1           8:17   0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdc              8:32   0 557.9G  0 disk
└─sdc1           8:33   0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdd              8:48   0 557.9G  0 disk
sde              8:64   0 557.9G  0 disk
└─sde1           8:65   0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdf              8:80   0 557.9G  0 disk
└─sdf1           8:81   0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdg              8:96   0 557.9G  0 disk
└─sdg1           8:97   0 557.9G  0 part
  └─tank-lvm   252:0    0   3.3T  0 lvm  /data
sdh              8:112  0 557.9G  0 disk
├─sdh1           8:113  0     1M  0 part
├─sdh2           8:114  0   9.8G  0 part /boot
├─sdh3           8:115  0    32G  0 part /
├─sdh4           8:116  0     1K  0 part
├─sdh5           8:117  0    16G  0 part [SWAP]
└─sdh6           8:118  0 500.1G  0 part /export/bench
0 /var/crash


The regression was only observed on some architectures.
I ran perf on the pgbench run and got the following reports.

Base kernel based on 6.12 -

# Overhead Command Shared Object Symbol
# ........ ............... ........................ .........................................
#
8.34% pgbench [kernel.kallsyms] [k] update_sg_lb_stats.isra.0
3.34% pgbench [kernel.kallsyms] [k] entry_SYSRETQ_unsafe_stack
2.57% pgbench [kernel.kallsyms] [k] sched_balance_find_src_rq
2.44% pgbench [kernel.kallsyms] [k] _find_next_and_bit
2.08% pgbench [kernel.kallsyms] [k] idle_cpu
1.99% pgbench [kernel.kallsyms] [k] __raw_spin_lock_irqsave
1.56% pgbench [kernel.kallsyms] [k] _raw_spin_lock
1.31% pgbench [kernel.kallsyms] [k] native_queued_spin_lock_slowpath
1.31% pgbench libpq.so.private16-5.16 [.] printfPQExpBuffer
1.13% pgbench [kernel.kallsyms] [k] cpu_util.constprop.0
1.11% pgbench [kernel.kallsyms] [k] syscall_return_via_sysret
1.10% pgbench [kernel.kallsyms] [k] rep_movs_alternative
1.08% pgbench [kernel.kallsyms] [k] unix_poll
----------------------cut here----------------------------

Base kernel based on 6.12, with the bisected commit reverted -

# Overhead Command Shared Object Symbol
# ........ ............... ....................... .........................................
#
9.19% pgbench [kernel.kallsyms] [k] update_sg_lb_stats.isra.0
3.44% pgbench [kernel.kallsyms] [k] entry_SYSRETQ_unsafe_stack
3.39% pgbench [kernel.kallsyms] [k] _find_next_and_bit
1.85% pgbench [kernel.kallsyms] [k] __raw_spin_lock_irqsave
1.83% pgbench [kernel.kallsyms] [k] idle_cpu
1.79% pgbench [kernel.kallsyms] [k] sched_balance_find_src_rq
1.61% pgbench [kernel.kallsyms] [k] _raw_spin_lock
1.21% pgbench [kernel.kallsyms] [k] entry_SYSCALL_64
1.16% pgbench [kernel.kallsyms] [k] unix_poll
1.16% pgbench [kernel.kallsyms] [k] syscall_return_via_sysret
1.12% pgbench [kernel.kallsyms] [k] cpu_util.constprop.0
1.09% pgbench [kernel.kallsyms] [k] native_queued_spin_lock_slowpath
----------------------cut here----------------------------
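
To make the two profiles easier to compare, the per-symbol overheads can be lined up side by side. The following is only a minimal sketch, assuming each sample line has the "overhead, command, shared object, [k]/[.], symbol" layout shown above. The helper names (parse_perf_report, diff_profiles) are hypothetical, and the sample strings are just the first few lines of each report. Symbols that appear in only one report are skipped, because the reports are cut off and a missing symbol only means it fell below the cut.

```python
import re

# Matches perf report sample lines such as:
#   8.34% pgbench [kernel.kallsyms] [k] update_sg_lb_stats.isra.0
# Header lines start with '#' and therefore do not match.
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+\S+\s+\S+\s+\[.\]\s+(\S+)")


def parse_perf_report(text):
    """Return {symbol: overhead_percent} for every sample line in the text."""
    overheads = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            overheads[match.group(2)] = float(match.group(1))
    return overheads


def diff_profiles(base, reverted):
    """Return (symbol, base_pct, reverted_pct, delta) rows for symbols present
    in both profiles, sorted by absolute delta (largest first)."""
    rows = []
    for symbol in set(base) & set(reverted):
        delta = round(base[symbol] - reverted[symbol], 2)
        rows.append((symbol, base[symbol], reverted[symbol], delta))
    return sorted(rows, key=lambda row: (-abs(row[3]), row[0]))


# First few lines of each report above, used as sample input.
BASE = """\
8.34% pgbench [kernel.kallsyms] [k] update_sg_lb_stats.isra.0
3.34% pgbench [kernel.kallsyms] [k] entry_SYSRETQ_unsafe_stack
2.57% pgbench [kernel.kallsyms] [k] sched_balance_find_src_rq
2.44% pgbench [kernel.kallsyms] [k] _find_next_and_bit
"""

REVERTED = """\
9.19% pgbench [kernel.kallsyms] [k] update_sg_lb_stats.isra.0
3.44% pgbench [kernel.kallsyms] [k] entry_SYSRETQ_unsafe_stack
3.39% pgbench [kernel.kallsyms] [k] _find_next_and_bit
1.79% pgbench [kernel.kallsyms] [k] sched_balance_find_src_rq
"""

for symbol, base_pct, rev_pct, delta in diff_profiles(
    parse_perf_report(BASE), parse_perf_report(REVERTED)
):
    print(f"{symbol:35s} {base_pct:6.2f} {rev_pct:6.2f} {delta:+6.2f}")
```

The percentages are relative shares within each run, so a delta here only shows where the profile shape shifted; it does not by itself measure the throughput difference.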

Please let me know if you need more information.

Thanks,
Samasth.