Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

From: Mikhail Gavrilov
Date: Mon Apr 29 2024 - 19:00:29 EST


On Wed, Apr 24, 2024 at 6:42 PM Takashi Iwai <tiwai@xxxxxxx> wrote:
>
> That said, maybe the first thing you can try would be to check who is
> actually calling the corresponding function
> (snd_timer_close_locked()). Put a debug print or a tracing hook to
> watch out for figuring out. If the commit was really relevant, it
> must be called very frequently and concurrently, and I don't know
> really who does it except for dmix/dsnoop.
>

It's madness.
Yes I added printk to snd_timer_close_locked and saw that function
invoked only 3 times when the system booted.
And never invoked during benchmark.

> git diff
diff --git a/Makefile b/Makefile
index 40fb2ca6fe4c..3ecff79a23b8 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 9
SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc6-test-build
NAME = Hurr durr I'ma ninja sloth

# *DOCUMENTATION*
diff --git a/sound/core/timer.c b/sound/core/timer.c
index 4d2ee99c12a3..59d8e4698b0b 100644
--- a/sound/core/timer.c
+++ b/sound/core/timer.c
@@ -407,7 +407,7 @@ static void snd_timer_close_locked(struct
snd_timer_instance *timeri,
struct device **card_devp_to_put)
{
struct snd_timer *timer = timeri->timer;
-
+ printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
if (timer) {
guard(spinlock_irq)(&timer->lock);
timeri->flags |= SNDRV_TIMER_IFLG_DEAD;

[Tue Apr 30 03:33:25 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
Mbps Full Duplex, Flow Control: RX/TX
[Tue Apr 30 03:33:33 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:36 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM TTY layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM socket layer initialized
[Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM ver 1.11
[Tue Apr 30 03:33:51 2024] systemd-journald[949]:
/var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
Journal file uses a different sequence number ID, rotating.
[Tue Apr 30 03:33:52 2024] rfkill: input handler enabled
[Tue Apr 30 03:33:55 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:33:55 2024] rfkill: input handler disabled
[Tue Apr 30 03:33:57 2024] input: solaar-keyboard as
/devices/virtual/input/input18
[Tue Apr 30 03:34:08 2024] DEBUG: Passed snd_timer_close_locked 410
[Tue Apr 30 03:34:11 2024] input: Noble FoKus Mystique (AVRCP) as
/devices/virtual/input/input19
[Tue Apr 30 03:34:23 2024] show_signal: 8 callbacks suppressed
[Tue Apr 30 03:34:23 2024] traps: gldriverquery[4464] general
protection fault ip:7ff83958c76f sp:7ffc464e0e00 error:0 in
libLLVM.so.18.1[7ff83920c000+3afd000]
[Tue Apr 30 03:34:25 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 5 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:34:26 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 7 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:35:27 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 11 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:36:28 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 19 times, consider switching to WQ_UNBOUND
[Tue Apr 30 03:39:33 2024] workqueue: gc_worker [nf_conntrack] hogged
CPU for >10000us 35 times, consider switching to WQ_UNBOUND

I saw a similar picture with ftrace.
I had no experience with ftrace, so I am laying out everything here
for you to see if I am doing something wrong or not.

# echo "snd_timer_close_locked" > /sys/kernel/tracing/set_ftrace_filter

# echo "function" > /sys/kernel/tracing/current_tracer

# echo 1 > /sys/kernel/tracing/tracing_on

/* Benchmark the game here */

# echo 0 > /sys/kernel/tracing/tracing_on

# cat /sys/kernel/tracing/trace
# tracer: function
#
# entries-in-buffer/entries-written: 0/0 #P:32
#
# _-----=> irqs-off/BH-disabled
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / _-=> migrate-disable
# |||| / delay
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |

# cat /sys/kernel/tracing/set_ftrace_filter
snd_timer_close_locked [snd_timer]

Now I am tormented by the question: how can reverting code that is not
invoked directly during a benchmark affect the benchmark result so
much?
I'm afraid that I won't be able to figure this out on my own.

--
Best Regards,
Mike Gavrilov.