Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory
From: Filipe Manana
Date: Sun Jul 07 2024 - 05:41:55 EST
On Sat, Jul 6, 2024 at 6:37 PM Filipe Manana <fdmanana@xxxxxxxxxx> wrote:
>
> On Sat, Jul 6, 2024 at 1:07 PM Andrea Gelmini <andrea.gelmini@xxxxxxxxx> wrote:
> >
> > On Sat, Jul 6, 2024 at 02:11 Andrea Gelmini
> > <andrea.gelmini@xxxxxxxxx> wrote:
> > > For the moment it seems we have a winner!
> >
> > I confirm this, but I forgot to mention that I now get a lot of these:
>
> Oh, those I added on purpose to confirm what the bpftrace logs
> suggested: concurrent calls into the shrinker.
>
>
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm firefox-bin nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm firefox-bin nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:06 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> > [sab lug 6 13:12:07 2024] BTRFS warning (device dm-0): extent
> > shrinker already running, comm cc1plus nr_to_scan 2
> >
> > Just for the record, this was while compiling LibreOffice.
> >
> > Meanwhile I was also running restic (a full backup, to force reading
> > everything), with no sluggishness at all.
>
> That's great!
>
> So I've been working on a proper approach following all those test
> results from you and Mikhail, and I would like to ask you both to try
> this branch:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/fdmanana/linux.git/log/?h=test3_em_shrinker_6.10
>
> Again, this is based on 6.10-rc6 plus 3 fixes for this issue you're both having.
>
> Can you guys test that branch?

I just updated the branch with a last-minute change that avoids an
unnecessary reschedule and re-lock, which should help reduce latency.

Thanks.

>
> Thank you a lot for all the time spent on this!
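
For anyone following along, the "extent shrinker already running" warnings quoted above come from a debugging check for concurrent entry into the shrinker. Below is a minimal userspace sketch of that kind of check, assuming a single "running" flag per filesystem. It is not the btrfs code: `struct fs_state`, `fs_state_init()`, `shrinker_try_enter()` and `shrinker_exit()` are hypothetical names, and C11 atomics stand in for whatever primitive the kernel patch actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for per-filesystem state; not the btrfs struct. */
struct fs_state {
    atomic_flag shrinker_running;  /* set while one caller is scanning */
    atomic_long skipped;           /* number of calls rejected as concurrent */
};

void fs_state_init(struct fs_state *fs)
{
    atomic_flag_clear(&fs->shrinker_running);
    atomic_init(&fs->skipped, 0);
}

/*
 * Try to become the only running shrinker instance.
 * Returns true if the caller may scan, false if another caller is
 * already inside (in which case a warning is logged and the call
 * is counted as skipped).
 */
bool shrinker_try_enter(struct fs_state *fs, const char *comm, long nr_to_scan)
{
    /* test_and_set returns the previous value: true means already running. */
    if (atomic_flag_test_and_set(&fs->shrinker_running)) {
        atomic_fetch_add(&fs->skipped, 1);
        fprintf(stderr,
                "extent shrinker already running, comm %s nr_to_scan %ld\n",
                comm, nr_to_scan);
        return false;
    }
    return true;
}

/* Must be called by the caller that got true from shrinker_try_enter(). */
void shrinker_exit(struct fs_state *fs)
{
    atomic_flag_clear(&fs->shrinker_running);
}
```

A caller would wrap its scan in `shrinker_try_enter()` / `shrinker_exit()` and simply return when the first one fails, so a second task arriving while a scan is in progress backs off instead of piling up behind the first.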