Re: [PATCH v6.13] fs/netfs/read_pgpriv2: skip folio queues without `marks3`

From: Max Kellermann
Date: Tue Feb 11 2025 - 03:06:07 EST


On Tue, Feb 11, 2025 at 7:30 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:

> > Note this patch doesn't apply to v6.14 as it was obsoleted by commit
> > e2d46f2ec332 ("netfs: Change the read result collector to only use one
> > work item").
>
> Why can't we just take what is upstream instead?
>
> Diverging from that ALWAYS ends up being more work and problems in the
> end. Only do so if you have no other choice.

Usually I agree with that, and I trust that you will make the right decision.

Before you decide, let me point out that netfs has been extremely
unstable since 6.10 (July 2024), ever since commit 2e9d7e4b984a ("mm:
Remove the PG_fscache alias for PG_private_2"). All of our web servers
have been crashing constantly since 6.10 (see
https://lore.kernel.org/netfs/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=zCy9Y87R1qw@xxxxxxxxxxxxxx/
for one of several bug reports I posted), and I went to
considerable trouble resisting the pressure from people asking me
to downgrade to 6.6. (I want the bugs fixed; I don't want to go back.)
For several months, 6.12 had been crashing instantly on boot due to
yet another netfs regression
(https://lore.kernel.org/netfs/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@xxxxxxxxxxxxxx/)
which wasn't fixed until 6.12.11, so our production servers are still
on 6.11.11 today.
Before these bugs got ironed out, v6.11 commit ee4cdf7ba857 ("netfs:
Speed up buffered reading") wreaked more havoc, leading to 8 "Fixes"
commits so far, plus the 2 I posted yesterday.

I wrote that e2d46f2ec332 has obsoleted my patch (actually both
patches), but I don't know if it really fixes both bugs. The code that
was buggy does not exist anymore in the form that my patch addresses,
but I don't know if it was just refactored and the bugs were kept (or
maybe yet more bugs sneaked in).


Max