Re: [PATCH v6 0/9] Multigenerational LRU Framework

From: Michal Hocko
Date: Mon Jan 10 2022 - 10:39:56 EST

Next message: Alexander Potapenko: "Re: [syzbot] KMSAN: kernel-usb-infoleak in usbnet_write_cmd (3)"
Previous message: Rafael J. Wysocki: "Re: [PATCH] ACPI: pfr_telemetry: Fix info leak in pfrt_log_ioctl()"
In reply to: Alexey Avramov: "Re: [PATCH v6 0/9] Multigenerational LRU Framework"
Next in thread: Yu Zhao: "Re: [PATCH v6 0/9] Multigenerational LRU Framework"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri 07-01-22 11:45:40, Yu Zhao wrote:
[...]
> Next, I argue that the benefits of this patchset outweigh its risks,
> because, drawing from my past experience,
> 1. There have been many larger and/or riskier patchsets taken; I'll
> assemble a list if you disagree.

No question about that. Changes in the reclaim path are paved with
failures and reverts and fine tuning on top of existing fine tuning.
The difference from your patchset is that they tend to be much, much
smaller and incremental, and therefore easier to review.

> And this patchset is fully guarded
> by #ifdef; Linus has also assessed on this point.

I appreciate you made the new behavior an opt-in and therefore existing
workloads are less likely to regress. I do not think ifdefs help
all that much, though, because a) realistically the config will
likely be enabled for most distribution kernels and b) the parallel
reclaim implementation adds a maintenance overhead regardless of those
ifdefs. The latter point is especially worrying because memory reclaim
is a complex and hard to review beast already. Any future changes would
need to consider both reclaim algorithms of course.

Hence I argue we really need a wider consensus that this is the right
direction we want to pursue.

> 2. There have been none that came with the testing/benchmarking
> coverage as this one did. Please point me to some if I'm mistaken,
> and I'll gladly match them.

I do appreciate your numbers but you should realize that this is an area
that is really hard to get any conclusive testing for. We keep learning
about fallouts on workloads we haven't really anticipated or where the
runtime effects happen to disagree with our intuition. So while those
numbers are nice there are other important aspects to consider like the
maintenance cost for example.

> The numbers might not materialize in the real world; the code is not
> perfect; and many other risks... But all the top eight open source
> memory hogs were covered, which is unprecedented; memcached and fio
> showed significant improvements and it only takes a few commands to
> see for yourselves.
>
> Regarding the acks and the reviewed-bys, I certainly can ask people
> who have reaped the benefits of this patchset to do them, if it's
> required. But I see less fun in that. I prefer to provide empirical
> evidence and convince people who are on the other side of the aisle.

I like to hear from users who benefit from your work and that certainly
gives more credit to it. But it will be the MM community to maintain the
code and address future issues.

We do not have a dedicated maintainer for memory reclaim but
certainly there are people who have helped shape the existing code and
have learned a lot from past issues - like Johannes, Rik, Mel, just
to name a few. If I were you I would be really looking into finding
agreement with them. I myself can help you with the memcg and OOM side
of things (we already have discussions about those).

Thanks!
--
Michal Hocko
SUSE Labs
