Re: [GIT PULL] x86/mm for 6.2

From: Linus Torvalds
Date: Wed Dec 14 2022 - 17:36:31 EST


On Tue, Dec 13, 2022 at 9:43 AM Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:
>
> This also contains a new hardware feature: Linear Address Masking
> (LAM). It is similar conceptually to the ARM Top-Byte-Ignore (TBI)
> feature and should allow userspace memory sanitizers to be used
> with less overhead on x86.

Christ.

Is it too late to ask Intel to call this "Top-Bits-Ignore", and
instead of adding another crazy TLA, we'd just all agree to call this
"TBI"?

I know, I know, NIH and all that, but at least as long as we are
limiting ourselves to regular US-ASCII, we really only have 17576
TLA's to go around, and at some point it gets not only confusing, but
really quite wasteful, to have everybody make up their own
architecture-specific TLA.

And while I'm on the subject: I really think that the changes to
"untagged_addr()" are fundamentally broken.

Why? That whole LAM (or TBI) is not necessarily per-mm. It can easily
be per-*thread*.

Imagine, if you will, a setup where you have some threads that use
tagged pointers, and some threads that don't.

For example, maybe the upper bits of the address contain a tag that
is only used within a virtual machine? You could even have the
"native" mode use the full address space, and put itself and its
private data in the upper bits virtually.

IOW, imagine using the virtual address masking as not just memory
sanitizers, but as an actual honest-to-goodness separation feature (eg
JITed code might fundamentally have access only to the lower bits,
while the JITter itself sees the whole address space).
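
To make the per-thread idea concrete, here is a minimal userspace sketch,
not kernel code. The names thread_tag_state, untag_with() and
per_thread_demo() are hypothetical, and the masks are illustrative only:
the sketch assumes a tag in the top byte and ignores how real hardware
treats bit 63 and kernel addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread state: which address bits survive untagging. */
struct thread_tag_state {
    uint64_t untag_mask;
};

/* Strip the tag bits using the mask of the thread doing the access. */
static uint64_t untag_with(const struct thread_tag_state *t, uint64_t addr)
{
    return addr & t->untag_mask;
}

/* Two threads of the same mm that disagree about the same pointer value. */
static int per_thread_demo(void)
{
    /* "JITed" thread: top byte is a tag and is ignored (assumed layout). */
    struct thread_tag_state jit = { .untag_mask = 0x00ffffffffffffffULL };
    /* "Native" thread: no masking, every bit is part of the address. */
    struct thread_tag_state native = { .untag_mask = ~0ULL };

    const uint64_t ptr = 0xab00007f12345678ULL;

    assert(untag_with(&jit, ptr) == 0x0000007f12345678ULL);
    assert(untag_with(&native, ptr) == ptr);
    return 0;
}
```

With state like this, a single mm-wide mask cannot give the right answer
for both threads, which is the scenario described above.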

Maybe that's not how LAM works on x86, but your changes to
untagged_addr() are *not* x86-specific.

So I really think this is completely wrong, quite aside from the
naming. It just makes assumptions that aren't valid.

The fact that you made this mm-specific actually ends up being an
active bug in the code, even on x86-64. You use the mmap lock to
serialize this all in prctl_enable_tagged_addr(), but then the read
side (ie just untagged_addr()) isn't actually serialized at all - and
*shouldn't* be serialized.
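
One way to picture the unserialized read side is the single-threaded
simulation below, which replays one fixed interleaving. It is a sketch
under stated assumptions, not the code from the pull request: mm_sim,
enable_tagging_sim() and untag_sim() are hypothetical stand-ins, and the
mmap lock is modelled as a plain flag, since the point is that the reader
never looks at it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for mm-wide state; not the kernel's mm_struct. */
struct mm_sim {
    bool     lock_held;   /* models the mmap lock as a plain flag */
    uint64_t untag_mask;  /* address bits kept when untagging */
};

/* Writer: changes the mm-wide mask while "holding" the lock. */
static void enable_tagging_sim(struct mm_sim *mm, uint64_t new_mask)
{
    assert(!mm->lock_held);
    mm->lock_held = true;
    mm->untag_mask = new_mask;
    mm->lock_held = false;
}

/* Reader: takes no lock at all, it just samples the current mask. */
static uint64_t untag_sim(const struct mm_sim *mm, uint64_t addr)
{
    return addr & mm->untag_mask;
}

/*
 * Fixed interleaving: the reader untags a pointer, the writer enables
 * tagging, and the reader untags the same pointer again. Returns true
 * if the two results differ.
 */
static bool reader_saw_two_answers(void)
{
    struct mm_sim mm = { .lock_held = false, .untag_mask = ~0ULL };
    const uint64_t ptr = 0x2a00000000001000ULL;

    uint64_t before = untag_sim(&mm, ptr);
    enable_tagging_sim(&mm, 0x00ffffffffffffffULL);
    uint64_t after = untag_sim(&mm, ptr);

    return before != after;
}
```

The lock orders writers against each other, but it does nothing for a
reader that never takes it, so the same pointer can untag to two
different values around the mask change.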

So I really think this is a fundamental design mistake, and while I
pulled it and sorted out the trivial conflicts, I've unpulled it again
as being actively mis-designed.

Linus