Re: [PATCH v8 14/21] mm/mmap: Avoid zeroing vma tree in mmap_region()

From: Lorenzo Stoakes
Date: Mon Oct 14 2024 - 05:47:12 EST


On Mon, Oct 14, 2024 at 12:35:59AM +0200, Bert Karwatzki wrote:
> I created a program which can trigger the bug on newer kernels (after the
> "Avoid zeroing vma tree in mmap_region()" patch and before the fix).
> My original goal was to trigger the bug on older kernels,
> but that does not work yet.
>
> Bert Karwatzki

Thanks, that's great!

For older kernels the problem should still be present; the fundamental
thing that changed, from the point of view of this bug, is that merge won't
contribute to the number of VMAs being overwritten at once.

To trigger prior to commit f8d112a4e657 ("mm/mmap: avoid zeroing vma tree
in mmap_region()") you would need to create a situation where the _clear_
triggers the bug, i.e. you must constitute all the VMAs that are being
overwritten by the store from existing VMAs you are overwriting with a
MAP_FIXED.

So some tweaks should get you there...
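
A rough, untested sketch of one such tweak. It assumes that in the
reproducer below the corrupting store covers the MAP_FIXED range plus the
previous VMA it merges with, so on an older kernel the MAP_FIXED range
itself is widened to cover that previous VMA, and a protection is used that
merges with nothing. The helper names widen_over_prev() and
overwrite_fixed() are made up for illustration, and whether this actually
hits the bug has not been verified.

```c
#define _GNU_SOURCE
#include <sys/mman.h>

struct range {
	unsigned long start;
	unsigned long len;
};

/*
 * Hypothetical helper: widen a MAP_FIXED request so that it also covers
 * the previous VMA (of size prev_len) that a merge would otherwise have
 * pulled into the store.
 */
static inline struct range widen_over_prev(unsigned long start,
					   unsigned long len,
					   unsigned long prev_len)
{
	struct range r = { start - prev_len, len + prev_len };

	return r;
}

/*
 * Hypothetical helper: overwrite the range with PROT_NONE. The reproducer
 * only uses PROT_READ and PROT_WRITE mappings, so this should not merge
 * with either neighbour (assumption).
 */
static inline void *overwrite_fixed(struct range r)
{
	return mmap((void *)r.start, r.len, PROT_NONE,
		    MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

With the values from the reproducer,
widen_over_prev(0x7700000, 0x1000000, 0x100000) gives a range starting at
0x7600000 with length 0x1100000, which would then be passed to
overwrite_fixed() in place of the final mmap().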

>
> #define _GNU_SOURCE
> #include <stdlib.h>
> #include <stdio.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <errno.h>
> #include <sys/mman.h>
>
> int main()
> {
> 	int ret, prot;
> 	void *addr, *tmp = NULL;
>
> 	// Create a lot of consecutive mappings to create a sufficiently deep maple tree
> 	for (int i = 0; i < 224; i++) {
> 		// We're creating mappings with different PROT_ to
> 		// avoid the vmas getting merged.
> 		if (i % 2)
> 			prot = PROT_READ;
> 		else
> 			prot = PROT_WRITE;
>
> 		// These mappings are all at very low addresses in the virtual address space so
> 		// they are mapped before the text and data sections of the executable and
> 		// the library and stack mappings
> 		tmp = mmap(tmp + 0x100000, 0x100000, prot, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
> 	}
>
> 	//
> 	// The maple node we're targeting has the range 0x7800000-0x86fffff (and 15 entries of size 0x100000 each)
> 	//
> 	// Here is the layout of the tree before the spanning store:
> 	//
> 	//                      [0 - ffffffffffffffff]
> 	//                       /                  \
> 	//                      /                    \
> 	//            [0-86fffff]                    [8700000-ffffffffffffffff]
> 	//           /     |     \                    /        |
> 	//          /      |      \                  /         |
> 	//       ...  [6900000-  [7800000-     [8700000-      ...
> 	//             77fffff]   86fffff]      87fffff]
> 	//
> 	// Do we always need a spanning_store AND a merge? Yes, and we must be careful that we do not merge
> 	// with the first vma of the next node.
> 	//
> 	// This gives a spanning_store because the newly created mapping can be merged
> 	// with the last mapping (0x7700000-0x77fffff) in the previous node as both have PROT_WRITE.
> 	// No corruption here! Why? This merges with the next node, too! (0x8700000-0x87fffff is PROT_WRITE, too)
> 	//addr = mmap((void *) 0x7800000, 0x1000000 - 0x100000, PROT_WRITE, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>
> 	// This gives a spanning_store, but no merge as the PROT_ flags do not match, no maple tree corruption here!
> 	//addr = mmap((void *) 0x7700000, 0x1000000, PROT_NONE, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>
> 	// This gives a spanning store, but no merge, no corruption here!
> 	//addr = mmap((void *) 0x7700000, 0x1000000, PROT_WRITE, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>
> 	// This last example gives the maple tree corruption and the validate_mm() error:
>
> 	// The mapping from 0x7600000 to 0x7700000 has PROT_READ, so this gives the needed merge
> 	addr = mmap((void *) 0x7700000, 0x1000000, PROT_READ, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>
> 	// Just for waiting (to examine the mappings in /proc/PID/maps)
> 	for (;;) {
> 	}
>
> 	return 0;
> }
>
>
>
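
For checking the addresses used in the comments of the reproducer above,
here is a small sketch of the layout arithmetic. It assumes every mmap()
hint in the loop is honoured, so that mapping i sits at (i + 1) * 0x100000;
nothing guarantees that without MAP_FIXED. The names map_index(),
map_prot() and entries_in() are hypothetical.

```c
#define _GNU_SOURCE
#include <sys/mman.h>

#define MAP_STEP 0x100000UL	/* size of each mapping and of each hint step */
#define NR_MAPS  224UL		/* iterations of the reproducer's loop */

/* Index of the loop mapping containing addr, or -1 if outside the run. */
static inline int map_index(unsigned long addr)
{
	if (addr < MAP_STEP || addr >= (NR_MAPS + 1) * MAP_STEP)
		return -1;
	return (int)(addr / MAP_STEP) - 1;
}

/*
 * Protection of the loop mapping containing addr: odd indices are
 * PROT_READ, even indices are PROT_WRITE, as in the reproducer.
 * Returns PROT_NONE for addresses outside the run (sketch convention).
 */
static inline int map_prot(unsigned long addr)
{
	int i = map_index(addr);

	if (i < 0)
		return PROT_NONE;
	return (i % 2) ? PROT_READ : PROT_WRITE;
}

/* Number of MAP_STEP-sized entries in the inclusive range [first, last]. */
static inline unsigned long entries_in(unsigned long first, unsigned long last)
{
	return (last - first + 1) / MAP_STEP;
}
```

Under that assumption, map_prot(0x7600000) is PROT_READ, map_prot(0x7700000)
and map_prot(0x8700000) are PROT_WRITE, and entries_in(0x7800000, 0x86fffff)
is 15, which matches the comments in the reproducer.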