Re: [PATCH 1/3] selftests/mm: virtual_address_range: Fix error when CommitLimit < 1GiB

From: Dev Jain
Date: Thu Jan 09 2025 - 00:40:42 EST



On 08/01/25 9:43 pm, Thomas Weißschuh wrote:
On Wed, Jan 08, 2025 at 02:36:57PM +0100, David Hildenbrand wrote:
On 08.01.25 09:05, Thomas Weißschuh wrote:
On Wed, Jan 08, 2025 at 11:46:19AM +0530, Dev Jain wrote:
On 07/01/25 8:44 pm, Thomas Weißschuh wrote:
If not enough physical memory is available, the kernel may fail mmap();
see __vm_enough_memory() and vm_commit_limit().
In that case the logic in validate_complete_va_space() does not make
sense and will even incorrectly fail.
Instead skip the test if no mmap() succeeded.

Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Cc: stable@xxxxxxxxxxxxxxx
CC stable on tests is ... odd.
I thought it was fairly common, but it isn't.
Will drop it.

Oh, well...
https://lore.kernel.org/all/20240521074358.675031-4-dev.jain@xxxxxxx/
I have done that before :) although the change I was making fixed a
fundamental flaw in the test, while your change fixes the test for a
specific case (memory pressure), so I tend to concur with David.


Signed-off-by: Thomas Weißschuh <thomas.weissschuh@xxxxxxxxxxxxx>

---
The logic in __vm_enough_memory() seems weird.
It describes itself as "Check that a process has enough memory to
allocate a new virtual mapping", but it never checks the current
memory usage of the process.
So it only disallows large mappings, while many small mappings taking the
same amount of memory are allowed; they are then even automatically merged
into one big mapping.
---
tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/mm/virtual_address_range.c b/tools/testing/selftests/mm/virtual_address_range.c
index 2a2b69e91950a37999f606847c9c8328d79890c2..d7bf8094d8bcd4bc96e2db4dc3fcb41968def859 100644
--- a/tools/testing/selftests/mm/virtual_address_range.c
+++ b/tools/testing/selftests/mm/virtual_address_range.c
@@ -178,6 +178,12 @@ int main(int argc, char *argv[])
 		validate_addr(ptr[i], 0);
 	}
 	lchunks = i;
+
+	if (!lchunks) {
+		ksft_test_result_skip("Not enough memory for a single chunk\n");
+		ksft_finished();
+	}
+
 	hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
 	if (hptr == NULL) {
 		ksft_test_result_skip("Memory constraint not fulfilled\n");

I do not know about __vm_enough_memory(), but I am going by your description:
you say that the kernel may fail mmap() when not enough physical memory is
available, but it may happen that we have already done 100 mmap()s and the
kernel only then fails mmap(), so if (!lchunks) won't be able to handle that case.
Basically, lchunks == 0 is not a complete indicator of the kernel failing mmap().
__vm_enough_memory() only checks the size of each single mmap() on its
own. It does not actually check the current memory or address space
usage of the process.
This seems a bit weird, as indicated in my after-the-fold explanation.
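Roughly what I mean, as a standalone toy (nothing from the test; TOTAL and
CHUNK are made-up sizes, and whether the big mmap() actually fails depends
on the overcommit mode and on RAM + swap):

#include <stdio.h>
#include <sys/mman.h>

#define TOTAL (64UL << 30)	/* pick something larger than RAM + swap */
#define CHUNK (64UL << 20)	/* 64 MiB per small mapping */

int main(void)
{
	unsigned long mapped = 0;
	void *p;

	/* One big writable mapping: this is the call that
	 * __vm_enough_memory() may reject. */
	p = mmap(NULL, TOTAL, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	printf("one %lu GiB mmap(): %s\n", TOTAL >> 30,
	       p == MAP_FAILED ? "failed" : "succeeded");

	/* The same amount as many small mappings: each one is checked
	 * on its own, so they typically all go through (and end up
	 * merged into one big mapping). */
	while (mapped < TOTAL) {
		p = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			break;
		mapped += CHUNK;
	}
	printf("small mmap()s: %lu of %lu GiB mapped\n",
	       mapped >> 30, TOTAL >> 30);
	return 0;
}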

The basic assumption of the test is that any process should be able to exhaust
its virtual address space; running the test under memory pressure, with the
kernel violating this behaviour, defeats the point of the test, I think?
The assumption is correct: as soon as one mapping succeeds, the others
will also succeed, until the actual address space is exhausted.

Looking at it again, __vm_enough_memory() is only called for writable
mappings, so it would be possible to use only readable mappings in the
test. The test will still fail with OOM, as the many PTEs need more than
1GiB of physical memory anyway, but at least that produces a usable
error message.
However, I'm not sure whether this would violate other assumptions of the test.

Note that with MAP_NORESERVE, most setups we care about will allow mapping as
much as you want, but on access the OOM killer will fire.
Thanks for the hint.

So one could require that /proc/sys/vm/overcommit_memory is set up properly
and use MAP_NORESERVE.
Isn't the check for lchunks == 0 essentially exactly this?

Reading from anonymous memory will populate the shared zeropage. To mitigate
OOM from "too many page tables", one could simply unmap the pieces as they
are verified (or MAP_FIXED over them, to free page tables).
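A sketch of the two variants of that suggestion (addr/len are placeholders
for the region that was just verified, not actual test code):

	/* Variant 1: drop the verified piece entirely, releasing its
	 * page tables along with it. */
	munmap(addr, len);

	/* Variant 2: map over it with MAP_FIXED so the address range
	 * stays reserved; per the suggestion above this should also get
	 * rid of the old page tables. */
	mmap(addr, len, PROT_NONE,
	     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);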
The code has to figure out whether a verified region was created by mmap(),
otherwise an munmap() could crash the process.
As the entries in /proc/self/maps may have been merged and (I assume)
the ordering of mappings is not guaranteed, some bespoke logic will be
needed to establish the link.

Is it fine to rely on CONFIG_ANON_VMA_NAME?
That would make it much easier to implement.
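What I have in mind is something like this (sketch only; needs
CONFIG_ANON_VMA_NAME, the prctl() constants are in <linux/prctl.h> on newer
headers, and the "va_range" name is arbitrary - the mapping then shows up
as "[anon:va_range]" in /proc/self/maps):

#include <sys/mman.h>
#include <sys/prctl.h>

#ifndef PR_SET_VMA			/* fallback for older uapi headers */
#define PR_SET_VMA		0x53564d41
#define PR_SET_VMA_ANON_NAME	0
#endif

	ptr[i] = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ,
		      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (ptr[i] != MAP_FAILED)
		/* Name the mapping so validate_complete_va_space() can
		 * tell the test's own regions apart from everything else
		 * when parsing /proc/self/maps. */
		prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
		      (unsigned long)ptr[i], MAP_CHUNK_SIZE, "va_range");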

Using MAP_NORESERVE and eager munmap()s, the testcase works nicely even
in very low physical memory conditions.
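I.e. something like this for the low-address loop (just the shape of it,
not the final patch; NR_CHUNKS_LOW, MAP_CHUNK_SIZE and validate_addr() are
what the test already uses):

	for (i = 0; i < NR_CHUNKS_LOW; i++) {
		/* MAP_NORESERVE: no commit accounting, so a tiny
		 * CommitLimit no longer makes mmap() fail. */
		ptr[i] = mmap(NULL, MAP_CHUNK_SIZE, PROT_READ | PROT_WRITE,
			      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
			      -1, 0);
		if (ptr[i] == MAP_FAILED)
			break;
		validate_addr(ptr[i], 0);
	}

plus an munmap() of each region in validate_complete_va_space() right
after it has been checked, so the page tables do not pile up.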

Thomas