Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased execution time of the kswapd0 process and symptoms as if there is not enough memory

From: Andrea Gelmini
Date: Fri Jul 05 2024 - 20:12:21 EST


On Thu, 4 Jul 2024 at 19:25, Filipe Manana
<fdmanana@xxxxxxxxxx> wrote:
> 2) Then drop that patch that disables the shrinker.
> With all the previous 4 patches applied, apply this one on top of them:
>
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
> The goal here is to see if the extent map eviction done by the
> shrinker is making reads from other tasks too slow, and check if
> that's what's making your system unresponsive.
>
> 3) Then drop the patch from step 2), and on top of the previous 4
> patches from my git tree, apply this one:
>
> https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
>
> This is just to confirm if we do have concurrent calls to the
> shrinker, as the tracing seems to suggest, and where the negative
> numbers come from.
> It also helps to check if not allowing concurrent calls to it, by
> skipping if it's already running, helps making the problems go away.
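
For anyone following along, the "skip if it's already running" idea
from step 3 can be sketched in user-space C roughly as below. This is
not the patch from the gist, only an illustration: struct scan_guard,
guarded_scan() and the two demo callbacks are names made up for this
example, and C11 atomics stand in for whatever the kernel code
actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard state; not a btrfs structure. */
struct scan_guard {
    atomic_bool running;  /* true while one caller is inside the scan */
    atomic_long skipped;  /* calls that bailed out because a scan was active */
};

typedef long (*scan_fn)(void *arg);

/*
 * Run scan(arg) unless another call is already in progress.
 * Returns the scan's result, or -1 if the call was skipped.
 */
long guarded_scan(struct scan_guard *g, scan_fn scan, void *arg)
{
    assert(g != NULL && scan != NULL);

    /* atomic_exchange returns the previous value: true means "busy". */
    if (atomic_exchange(&g->running, true)) {
        atomic_fetch_add(&g->skipped, 1);
        return -1;
    }

    long freed = scan(arg);

    atomic_store(&g->running, false);
    return freed;
}

/* Demo callback: pretends to free 42 items. */
long fake_scan(void *arg)
{
    (void)arg;
    return 42;
}

/* Demo callback: tries to start a second scan while the first is active. */
struct reentry {
    struct scan_guard *g;
    long inner_result;
};

long reentrant_scan(void *arg)
{
    struct reentry *r = arg;
    r->inner_result = guarded_scan(r->g, fake_scan, NULL);
    return 7;
}
```

A caller that finds the flag already set returns immediately instead
of queueing up behind the running scan, so at most one scan is active
at any time.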

Uhm... good news...
To recap, here are this evening's tests:

Kernel 6.6.36:
Fresh BTRFS (tar cp . | pv -ta > /dev/null): 0:03:53 [ 231MiB/s]
(elapsed time and average speed)
Aged snapshots (tar cp /.snapshots/ | pv -at -s 100G -S > /dev/null):
0:02:20 [ 726MiB/s]

Kernel rc6+branch+2nd patch:
Fresh BTRFS: 0:03:14 [ 278MiB/s]
Aged snapshots: I had to stop. PSI memory > 80%. Processes were stuck
most of the time, e.g. mplayer over NFS stalls every few seconds for a
while, and switching virtual desktops takes more than 5 seconds. Also,
"echo 3 > drop_caches" takes more than 5 minutes to finish (on the
other two kernels it was almost immediate).
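
In case it helps to reproduce the PSI observation: the kernel exposes
memory pressure in /proc/pressure/memory as a "some" and a "full"
line, each with avg10/avg60/avg300 percentages and a total counter. A
minimal sketch of reading it in C follows. The helper names are made
up for this example, and which line and averaging window correspond
to the > 80% figure is an assumption (the example returns avg10).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a /proc/pressure/<resource> file. */
struct psi_line {
    char kind[8];               /* "some" or "full" */
    double avg10, avg60, avg300; /* percentages over 10s, 60s, 300s */
    unsigned long long total;   /* cumulative stall time in microseconds */
};

/* Parse one PSI line; returns 1 on success, 0 if the format does not match. */
int parse_psi_line(const char *line, struct psi_line *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
                   out->kind, &out->avg10, &out->avg60, &out->avg300,
                   &out->total);
    return n == 5;
}

/*
 * Return avg10 for the requested kind ("some" or "full") from the file
 * at path, or -1.0 if the file cannot be read or the kind is missing.
 */
double read_pressure_avg10(const char *path, const char *kind)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1.0;

    char buf[256];
    struct psi_line p;
    double result = -1.0;

    while (fgets(buf, sizeof buf, f) != NULL) {
        if (parse_psi_line(buf, &p) && strcmp(p.kind, kind) == 0) {
            result = p.avg10;
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling read_pressure_avg10("/proc/pressure/memory", "some") once a
second during the tar run would give a rough trace of when the
pressure climbs.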

Kernel rc6+branch+3rd patch:
Fresh BTRFS: 0:03:40 [ 245MiB/s]
Aged snapshots: 0:02:03 [ 826MiB/s]
N.B.: no skyrocketing PSI memory, no swap pressure, no sluggishness!

Now, that was just one run, so I'm going to use this patch for a few
days. Next week I can tell you for sure whether everything is right.
For the moment it seems we have a winner!