Re: [PATCH] mm: Always sanity check anon_vma first for per-vma locks

From: Peter Xu
Date: Wed Apr 10 2024 - 16:44:07 EST


On Wed, Apr 10, 2024 at 09:26:45PM +0100, Matthew Wilcox wrote:
> On Wed, Apr 10, 2024 at 01:06:21PM -0400, Peter Xu wrote:
> > anon_vma is a tricky object in the context of per-vma lock, because it's
> > racy to modify it in that context and mmap lock is needed if it's not
> > stable yet.
>
> I object to this commit message. First, it's not a "sanity check". It's
> a check to see if we already have an anon VMA. Second, it's not "racy
> to modify it" at all. The problem is that we need to look at other
> VMAs, for which we do not hold the lock.

For that "do not hold locks" part, isn't that "racy"?

If it is, can I still word it as "racy to modify"? We can't modify anon_vma
in that context because reading the other vmas without holding their locks
is racy.

As for "sanity check": to me the check does fall into that category, but
I'm not a native speaker, so I'm open to rewording any of the above.

>
> > So the trivial side effect of such patch is:
> >
> > - We may do slightly better on the first WRITE of a private file mapping,
> > because we can retry earlier (in lock_vma_under_rcu(), rather than
> > vmf_anon_prepare() later).
> >
> > - We may always use mmap lock for the initial READs on a private file
> > mappings, while before this patch it _can_ (only when no WRITE ever
> > happened... but it doesn't make much sense for a MAP_PRIVATE..) do the
> > read fault with per-vma lock.
>
> But that's a super common path! Look at 'cat /proc/self/maps'. All
> your program text (including libraries) is mapped PRIVATE, and never
> written to (except by ptrace, I guess).
>
> NAK this patch.

We're talking about any vma that will first benefit from a per-vma lock
here, right?

I think that should only be relevant to some major VMA, or a bunch of VMAs,
that userspace maps explicitly. IIUC the goal is to reduce the cache
bouncing of the lock, which used to be per-mm, by replacing it with a
finer-grained one. It doesn't sound right that these libraries even fall
into this category, as they should just get loaded soon enough when the
program starts.

IOW, my understanding is that such normal vmas or simple programs don't
benefit from per-vma lock that much; we take either the per-vma read lock
or the mmap read lock, and I would expect similar performance when such
cache bouncing isn't heavy.
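
For reference, the behaviour change under discussion for a MAP_PRIVATE file
mapping can be written down as a toy userspace model. This is only a sketch
of the two side effects quoted above, not kernel code: the enum, the struct
and both functions are hypothetical names made up for illustration, and the
model deliberately ignores every other reason a fault may fall back to the
mmap lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
enum fault_path {
	PATH_PER_VMA_LOCK,	/* fault is handled under the per-vma lock */
	PATH_RETRY_EARLY,	/* give up in lock_vma_under_rcu(), use mmap lock */
	PATH_RETRY_LATE,	/* give up in vmf_anon_prepare(), use mmap lock */
};

/* A private file mapping, reduced to the one bit that matters here. */
struct toy_private_file_vma {
	bool has_anon_vma;	/* false as long as no WRITE ever happened */
};

/* Before the patch: only a WRITE needs the anon_vma, and it retries late. */
enum fault_path path_before_patch(const struct toy_private_file_vma *vma,
				  bool is_write)
{
	assert(vma);
	if (vma->has_anon_vma)
		return PATH_PER_VMA_LOCK;
	return is_write ? PATH_RETRY_LATE : PATH_PER_VMA_LOCK;
}

/* After the patch: a missing anon_vma always retries early, READ or WRITE. */
enum fault_path path_after_patch(const struct toy_private_file_vma *vma,
				 bool is_write)
{
	assert(vma);
	(void)is_write;		/* the fault type no longer matters */
	return vma->has_anon_vma ? PATH_PER_VMA_LOCK : PATH_RETRY_EARLY;
}
```

In this model the only case that gets worse is a READ on a mapping that was
never written to, which moves from the per-vma lock to the mmap lock; that
is the case the objection above is about.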

I can do some tests later today or tomorrow. Do you have any suggestions
on how to amplify the effect you're concerned about?

--
Peter Xu