Re: [PATCH v4 1/1] exec: seal system mappings
From: enh
Date: Thu Feb 06 2025 - 09:19:33 EST
On Thu, Jan 23, 2025 at 5:38 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
[heh, long time no see! haven't been on an email thread with you in a
while :-) ]
> On Thu, Jan 23, 2025 at 04:50:46PM -0500, enh wrote:
> > yeah, at this point i should (a) drag in +cferris who may have actual
> > experience of this and (b) admit that iirc i've never personally seen
> > _evidence_ of this, just claims. most famously in the chrome source...
> > if you `grep -r /proc/.*/maps` you'll find lots of examples, but
> > something like https://chromium.googlesource.com/chromium/src/+/main/base/debug/proc_maps_linux.h#61
> > is quite representative of the "folklore" in this area.
>
> That folklore is 100% based on a true story! I'm not sure that all of
> the details are precisely correct, but it's true enough that I wouldn't
> quibble with it.
>
> In fact, we want to make it worse. Because the mmap_lock is such a
> huge point of contention, we want to read /proc/PID/maps protected
> only by RCU. That will relax the guarantees to:
>
> a. If a VMA existed and was not modified during the duration of the
> read, it will definitely be returned.
> b. If a VMA was added during the call, it might be returned.
> c. If a VMA was removed during the call, it might be returned.
> d. If an address was covered by a VMA before the call and that
> VMA was modified during the call, you might get the prior or
> posterior state of the VMA. And you might get both!
>
> What might be confusing:
>
> e. If VMA A is added, then VMA B is added, your call might show you VMA
> B and not VMA A.
> f. Similarly for deleted.
> g. If you have, say, a VMA from (4000-9000) and you mprotect the region
>    (5000-6000), you might see:
>        4000-9000 oldA
>    or
>        4000-5000 newA
>        4000-9000 oldA
>    or
>        4000-5000 newA
>        5000-6000 newB
>        4000-9000 oldA
>    or
>        4000-5000 newA
>        5000-6000 newB
>        6000-9000 newC
>
> (it's possible other combinations might be visible; i'm not working on
> the details of this right now)
>
> We shouldn't be able to _skip_ a VMA. That seems far worse than
> returning duplicates; if your maps parser sees duplicates it can either
> try to figure it out itself, or retry the whole read.
yeah, fwiw i don't think i've seen a case where a duplicate would
matter --- half the code i've seen ["tell me more about the VMA
containing this address"] would just stop at the first match anyway
(though that's exactly the case where i'd rather have "direct access"
than have to search), and the other half ["give me a snapshot of all
the VMAs for offline debugging purposes"] doesn't really bother with
interpretation and leaves that up to humans.
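
to make the two consumer patterns concrete, here's a minimal sketch in python (not taken from any real codebase). all the names here (`Vma`, `parse_maps`, `has_overlaps`, `find_vma`, `read_maps_consistent`) are hypothetical, and the "retry if entries overlap" policy is just one assumed way to act on the "retry the whole read" suggestion above. it's only a heuristic: it can notice the duplicate/overlap cases from example (g), but it can't prove that a read was consistent.

```python
import re
from typing import Callable, List, NamedTuple, Optional


class Vma(NamedTuple):
    start: int   # inclusive start address
    end: int     # exclusive end address
    perms: str   # second field of the line, e.g. "r-xp"
    rest: str    # everything after perms, kept uninterpreted


# "start-end perms ..." with hex addresses, as in /proc/PID/maps.
_LINE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+)\s+(\S+)\s*(.*)$")


def parse_maps(text: str) -> List[Vma]:
    """Parse maps-style text, skipping lines that don't match."""
    vmas = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            vmas.append(Vma(int(m.group(1), 16), int(m.group(2), 16),
                            m.group(3), m.group(4)))
    return vmas


def has_overlaps(vmas: List[Vma]) -> bool:
    """True if any two entries overlap, i.e. the read looks torn."""
    ordered = sorted(vmas, key=lambda v: (v.start, v.end))
    # Once sorted by start, any overlap shows up between neighbours.
    return any(a.end > b.start for a, b in zip(ordered, ordered[1:]))


def find_vma(vmas: List[Vma], addr: int) -> Optional[Vma]:
    """'Tell me about the VMA containing addr': stop at the first match."""
    for v in vmas:
        if v.start <= addr < v.end:
            return v
    return None


def read_maps_consistent(read_text: Callable[[], str],
                         max_tries: int = 3) -> List[Vma]:
    """Re-read until no overlaps are seen, or give up after max_tries.

    read_text is any callable returning the file contents, so the policy
    can be exercised without a real /proc. On giving up, the last
    (possibly torn) view is returned.
    """
    vmas: List[Vma] = []
    for _ in range(max_tries):
        vmas = parse_maps(read_text())
        if not has_overlaps(vmas):
            break
    return vmas
```

on a real system the reader would be something like `lambda: open("/proc/self/maps").read()`. a "snapshot for offline debugging" consumer would probably skip `has_overlaps` entirely and just dump the text.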