Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory

From: Filipe Manana
Date: Sat Jul 06 2024 - 13:38:09 EST


On Sat, Jul 6, 2024 at 1:07 PM Andrea Gelmini <andrea.gelmini@xxxxxxxxx> wrote:
>
> On Sat, Jul 6, 2024 at 02:11 Andrea Gelmini
> <andrea.gelmini@xxxxxxxxx> wrote:
> > For the moment it seems we have a winner!
>
> I confirm this, but I forgot to mention that I also get a lot of these:

Oh, those I added on purpose to confirm what the bpftrace logs
suggested: concurrent calls into the shrinker.
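
For readers following along, here is a minimal userspace sketch of the
kind of "already running" check that can produce such a warning. It is
an illustration only and not the code in the test branch: struct fake_fs,
fake_fs_init(), shrinker_try_enter() and shrinker_exit() are hypothetical
names, and it assumes a single atomic flag is enough to detect a second
caller entering while a first one is still scanning.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-filesystem state; not a btrfs structure. */
struct fake_fs {
	atomic_flag shrinker_running;	/* set while one caller is scanning */
	atomic_long skipped;		/* concurrent calls that were turned away */
};

static void fake_fs_init(struct fake_fs *fs)
{
	atomic_flag_clear(&fs->shrinker_running);
	atomic_init(&fs->skipped, 0);
}

/*
 * Returns true if the caller may scan, false if another call is already
 * in progress. The message mimics the warning format quoted in this thread.
 */
static bool shrinker_try_enter(struct fake_fs *fs, const char *comm,
			       long nr_to_scan)
{
	/* test_and_set returns the previous value: true means someone is in. */
	if (atomic_flag_test_and_set(&fs->shrinker_running)) {
		fprintf(stderr,
			"extent shrinker already running, comm %s nr_to_scan %ld\n",
			comm, nr_to_scan);
		atomic_fetch_add(&fs->skipped, 1);
		return false;
	}
	return true;
}

/* Must be called only by the caller that got true from shrinker_try_enter(). */
static void shrinker_exit(struct fake_fs *fs)
{
	atomic_flag_clear(&fs->shrinker_running);
}
```

With a check like this, every burst of identical warnings corresponds to
tasks (cc1plus, firefox-bin, ...) entering reclaim while another task was
still inside the shrinker.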


> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm firefox-bin nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm firefox-bin nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
> [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> shrinker already running, comm cc1plus nr_to_scan 2
>
> Just for the record, this was while compiling LibreOffice.
>
> Meanwhile, with restic running (a full backup, to force reading
> everything), there is no sluggishness at all.

That's great!

So I've been working on a proper approach following all those test
results from you and Mikhail, and I would like to ask you both to try
this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10

Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.

Can you guys test that branch?

Thanks a lot for all the time spent on this!