Re: regression/bisected/6.9 commit 587d67fd929ad89801bcc429675bda90d53f6592 decrease 30% of gaming performance

From: Takashi Iwai
Date: Tue Apr 30 2024 - 01:51:09 EST


On Tue, 30 Apr 2024 01:00:07 +0200,
Mikhail Gavrilov wrote:
>
> On Wed, Apr 24, 2024 at 6:42 PM Takashi Iwai <tiwai@xxxxxxx> wrote:
> >
> > That said, maybe the first thing you can try would be to check who is
> > actually calling the corresponding function
> > (snd_timer_close_locked()). Put a debug print or a tracing hook to
> > watch out for figuring out. If the commit was really relevant, it
> > must be called very frequently and concurrently, and I don't know
> > really who does it except for dmix/dsnoop.
> >
>
> It's madness.
> Yes I added printk to snd_timer_close_locked and saw that function
> invoked only 3 times when the system booted.
> And never invoked during benchmark.
>
> > git diff
> diff --git a/Makefile b/Makefile
> index 40fb2ca6fe4c..3ecff79a23b8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2,7 +2,7 @@
> VERSION = 6
> PATCHLEVEL = 9
> SUBLEVEL = 0
> -EXTRAVERSION = -rc6
> +EXTRAVERSION = -rc6-test-build
> NAME = Hurr durr I'ma ninja sloth
>
> # *DOCUMENTATION*
> diff --git a/sound/core/timer.c b/sound/core/timer.c
> index 4d2ee99c12a3..59d8e4698b0b 100644
> --- a/sound/core/timer.c
> +++ b/sound/core/timer.c
> @@ -407,7 +407,7 @@ static void snd_timer_close_locked(struct
> snd_timer_instance *timeri,
> struct device **card_devp_to_put)
> {
> struct snd_timer *timer = timeri->timer;
> -
> + printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
> if (timer) {
> guard(spinlock_irq)(&timer->lock);
> timeri->flags |= SNDRV_TIMER_IFLG_DEAD;
>
> [Tue Apr 30 03:33:25 2024] igc 0000:0a:00.0 eno1: NIC Link is Up 1000
> Mbps Full Duplex, Flow Control: RX/TX
> [Tue Apr 30 03:33:33 2024] rfkill: input handler disabled
> [Tue Apr 30 03:33:36 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM TTY layer initialized
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM socket layer initialized
> [Tue Apr 30 03:33:36 2024] Bluetooth: RFCOMM ver 1.11
> [Tue Apr 30 03:33:51 2024] systemd-journald[949]:
> /var/log/journal/1b6a399fb0874de8b095a739fe2ff323/user-1000.journal:
> Journal file uses a different sequence number ID, rotating.
> [Tue Apr 30 03:33:52 2024] rfkill: input handler enabled
> [Tue Apr 30 03:33:55 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:33:55 2024] rfkill: input handler disabled
> [Tue Apr 30 03:33:57 2024] input: solaar-keyboard as
> /devices/virtual/input/input18
> [Tue Apr 30 03:34:08 2024] DEBUG: Passed snd_timer_close_locked 410
> [Tue Apr 30 03:34:11 2024] input: Noble FoKus Mystique (AVRCP) as
> /devices/virtual/input/input19
> [Tue Apr 30 03:34:23 2024] show_signal: 8 callbacks suppressed
> [Tue Apr 30 03:34:23 2024] traps: gldriverquery[4464] general
> protection fault ip:7ff83958c76f sp:7ffc464e0e00 error:0 in
> libLLVM.so.18.1[7ff83920c000+3afd000]
> [Tue Apr 30 03:34:25 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 5 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:34:26 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 7 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:35:27 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 11 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:36:28 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 19 times, consider switching to WQ_UNBOUND
> [Tue Apr 30 03:39:33 2024] workqueue: gc_worker [nf_conntrack] hogged
> CPU for >10000us 35 times, consider switching to WQ_UNBOUND
>
> I saw a similar picture with ftrace.
> I had no experience with ftrace, so I am laying out everything here
> for you to see if I am doing something wrong or not.
>
> # echo "snd_timer_close_locked" > /sys/kernel/tracing/set_ftrace_filter
>
> # echo "function" > /sys/kernel/tracing/current_tracer
>
> # echo 1 > /sys/kernel/tracing/tracing_on
>
> /* Benchmark the game here */
>
> # echo 0 > /sys/kernel/tracing/tracing_on
>
> # cat /sys/kernel/tracing/trace
> # tracer: function
> #
> # entries-in-buffer/entries-written: 0/0 #P:32
> #
> # _-----=> irqs-off/BH-disabled
> # / _----=> need-resched
> # | / _---=> hardirq/softirq
> # || / _--=> preempt-depth
> # ||| / _-=> migrate-disable
> # |||| / delay
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
>
> # cat /sys/kernel/tracing/set_ftrace_filter
> snd_timer_close_locked [snd_timer]
>
> Now I am tormented by the question: how can reverting code that is not
> invoked directly during a benchmark affect the benchmark result so
> much?

It implies that it's not about the revert that influences on the
result. It might be rather the fact that you built the kernel by
yourself. Try to build the kernel again with a debug print but
without the revert of the spinlock. Does it still show the same
improvement? If so, try again without the debug print and retest.


Takashi