Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

From: Barry Song

Date: Tue Jun 02 2026 - 18:31:50 EST


On Wed, Jun 3, 2026 at 3:57 AM Harry Yoo <harry@xxxxxxxxxx> wrote:
>
> On 6/2/26 11:15 AM, Barry Song wrote:
> > On Mon, Jun 1, 2026 at 9:46 AM wangtao <tao.wangtao@xxxxxxxxx> wrote:
> > [...]
> >>
> >> You said discussion was welcome, yet when someone offered even a
> >> small comment, you refused to continue the discussion.
> >>
> >> If I had known you would be this inconsistent, I would not have
> >> replied to you in the first place.
> >>
> >> This will be my last reply to you. I will not respond again.
> >
> > Hi Tao,
> >
> > Please don't walk away from the linux-mm community. I read your
> > patchset and found it quite valuable. It not only reduces memory
> > overhead, but also eliminates rmap costs for exclusive folios.
> >
> > Since I'm not very confident discussing technical topics in English,
> > I wrote a blog post in Chinese about your patchset:
> >
> > https://mp.weixin.qq.com/s/k00tzhTl8HbL3k4G6ev4SA
>
> The cover letter and commit messages should have been elaborated to a
> much greater degree instead of making people guess the design and intent
> from the code.

Indeed. The cover letter does not clearly tell the story, and yesterday
I needed quite some time to understand what the patchset
was trying to achieve.

>
> > I have to admit that I found the implementation quite complex and
> > in need of significant improvement.
> >
> > However, I think the underlying idea is very interesting and worth
> > exploring further.
>
> No. What it is trying to achieve is ambitious, but the idea itself is
> not worth exploring further as-is unless the correctness and complexity
> concerns are addressed.

Can we give Tao more time to address the concerns and explain
the correctness of the approach?

That said, I don't think the patchset is entirely without merit.
The idea that caught my attention is whether knowing that a
process is guaranteed to be a leaf process could allow us to
simplify parts of the rmap machinery and reduce some of the
associated overhead.

If we accept that a fork server (e.g. systemd or zygote) is preferable
to having each application call fork() itself, it is worth noting that
Linux systems already largely rely on fork servers in practice. Matthew
also pointed out that calling fork() in multithreaded applications is a
terrible idea [1]. This may suggest that, in general, processes
outside of a fork-server model should avoid using fork().

If we were to introduce an API such as prctl(PR_SET_NOFORK) or
something similar, could we eliminate a significant portion of
the rmap-related overhead for such leaf processes, while still
avoiding the complexity of the lazy allocation scheme proposed
by Tao?
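
To make the question a bit more concrete, here is a rough userspace
sketch of how a process might opt in. Everything about it is an
assumption: PR_SET_NOFORK does not exist in any kernel today, the
numeric value is only a placeholder so that the code compiles, and the
helper names are made up for illustration. On current kernels the
prctl() call is expected to fail, so the sketch simply falls back.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * HYPOTHETICAL: PR_SET_NOFORK is only an idea from this thread. No kernel
 * defines it; the value below is a placeholder, not a real prctl option.
 */
#ifndef PR_SET_NOFORK
#define PR_SET_NOFORK 0x4e4f464b
#endif

/*
 * Ask the kernel to treat the caller as a leaf process that will never
 * fork(). Returns 0 if the request was accepted, or a negative errno
 * value if the kernel refused it or does not know the option.
 */
int declare_leaf_process(void)
{
	if (prctl(PR_SET_NOFORK, 1UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical call site, e.g. early in an application's startup.
 * Returns 1 if the process is now marked as a leaf, 0 if it keeps
 * running with the regular rmap behaviour.
 */
int leaf_process_init(void)
{
	int ret = declare_leaf_process();

	if (ret < 0) {
		fprintf(stderr,
			"PR_SET_NOFORK not available (%s), continuing without it\n",
			strerror(-ret));
		return 0;
	}
	return 1;
}
```

The point of the sketch is only that the opt-in would be a single,
one-way call made by the process itself (or by the fork server right
after creating the child), so the kernel would know up front that no
anon_vma hierarchy needs to be prepared for a later fork().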

I assume that the vast majority of processes in a real system
are leaf processes?

It also seems somewhat unusual that a few Android applications
invoke fork() directly in a multithreaded context, while most
use the zygote to create multiple processes for an app. Perhaps
the Android framework should discourage this pattern entirely,
and require applications to create child processes via the zygote?

If, in real-world systems, more than 95% of processes are leaf
processes, could that imply that the rmap design might be
reconsidered for a different optimization path?

[1] https://marc.info/?l=linuxppc-embedded&m=177912107460825&w=2

>
> > I'm looking forward to seeing a v2 RFC with a cleaner and simpler
> > implementation while preserving the core concept.
>
> I'm afraid this encouragement would mislead us in the wrong direction,
> where all of us end up wasting time.
>
> There isn't much point in posting v2 without addressing fundamental
> questions about the design.

I suggested a v2 because the current patchset does not clearly
state what it is trying to achieve. A revised version might help
clarify the intent and make it easier to understand. Even if the
overall complexity (such as lazy allocation) makes it hard to
move forward, we may still be able to learn from it and gain
some useful inspiration.

>
> > Regardless of whether it ultimately gets merged, I hope the discussion
> > can continue.
>
> Regarding the "improving the reverse mapping subsystem" topic, a more
> constructive direction would be to carefully revisit the design
> decisions and discuss what we can do about them (that's exactly what
> Lorenzo has been doing).

I have no doubt at all about Lorenzo’s expertise in rmap and many
other mm areas. That is well understood and widely recognized.

I just think that hearing more perspectives could help us gain
additional insight and inspiration.

>
> But that's not the first thing I would recommend to a relatively new
> contributor given that it's really complicated and even the people who
> have designed and reworked the reverse mapping subsystem over the past
> 20+ years haven't come up with a fundamentally better design.
>
> Reverse mapping is a frustratingly complicated subsystem. Without
> carefully revisiting the current design, there is not much hope of
> improving things at the design level, even slightly.
>
> What I would recommend to new people instead is:
>
> 1) starting by reviewing other people's work, so that you have enough
> time to learn the historical context and subtleties of the subsystem
> without making intrusive changes (which also keeps in touch with the
> community), and
>
> 2) making progress on smaller tasks with less intrusive changes, to
> gradually build trust and be able to do more valuable work.
>

Yes, that is a good approach for new contributors.

> Unfortunately, looking at how this thread went, I see that the author is
> now in a worse position than an entirely new contributor.
>
> --
> Cheers,
> Harry / Hyeonggon

Thanks
Barry