Re: [PATCH 2/2] selftests/mm: compaction_test: Fix trivial test success and reduce probability of OOM-killer invocation

From: Dev Jain
Date: Mon May 20 2024 - 01:33:56 EST



On 5/20/24 05:33, Andrew Morton wrote:
On Wed, 15 May 2024 15:06:33 +0530 Dev Jain <dev.jain@xxxxxxx> wrote:

Reset nr_hugepages to zero before the start of the test.

If a non-zero number of hugepages is already set before the start of the
test, the following problems arise:

- The probability of the test getting OOM-killed increases.
Proof: The test wants to run on 80% of available memory to prevent
OOM-killing (see original code comments). Let the value of mem_free at the
start of the test, when nr_hugepages = 0, be x. In the other case, when
nr_hugepages > 0, let the memory consumed by hugepages be y. In the former
case, the test operates on 0.8 * x of memory. In the latter, the test
operates on 0.8 * (x - y) of memory with y already in use, so the total
memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 * x. Q.E.D.

- The probability of a bogus test success increases.
Proof: Let the memory consumed by hugepages be greater than 25% of x,
with x and y defined as above. The definition of compaction_index is
c_index = (x - y)/z where z is the memory consumed by hugepages after
trying to increase them again. In check_compaction(), we set the number
of hugepages to zero, and then increase them back; the probability that
they will be set back to consume at least y amount of memory again is
very high (since there is not much delay between the two attempts of
changing nr_hugepages). Hence, z >= y > (x/4) (by the 25% assumption).
Therefore,
c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3
so c_index can always be forced below 3, and the test then always
succeeds. Q.E.D.
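
To make the two bounds concrete, here is a small sketch in C. The helper names are hypothetical and not part of compaction_test.c, the figures in the note below are made up, and the use of integer arithmetic is an assumption of this sketch.

```c
#include <assert.h>

/*
 * Memory in use once the test has filled 80% of what is still free.
 * x: free memory when nr_hugepages = 0
 * y: memory already held by huge pages before the test starts
 */
static unsigned long mem_used_by_test(unsigned long x, unsigned long y)
{
	assert(y <= x);
	return y + (x - y) * 8 / 10;
}

/*
 * Upper bound on c_index = (x - y) / z under the assumption z >= y,
 * i.e. the huge pages come back at least as large as before.
 */
static unsigned long compaction_index_bound(unsigned long x, unsigned long y)
{
	assert(y > 0 && y <= x);
	return (x - y) / y;
}
```

With x = 8000 and y = 2400 (30% of x, in any consistent unit), mem_used_by_test() gives 6880 rather than the 6400 of the y = 0 case, and compaction_index_bound() gives 2, which is below 3.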

NOTE: This patch depends on the previous one.

-int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
+int check_compaction(unsigned long mem_free, unsigned int hugepage_size,
+		     int initial_nr_hugepages)
 {
 	int fd, ret = -1;
 	int compaction_index = 0;
-	char initial_nr_hugepages[10] = {0};
 	char nr_hugepages[10] = {0};
+	char init_nr_hugepages[10] = {0};
+
+	sprintf(init_nr_hugepages, "%d", initial_nr_hugepages);

Well, [10] isn't really large enough. "-1111111111" requires 12 chars,
with the trailing \0. And I'd suggest an unsigned type and a %u -
negative initial_nr_hugepages doesn't make a lot of sense.
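
The buffer-size point can be checked mechanically. The sketch below uses the standard C99 behaviour of snprintf() with a NULL buffer and size 0, which returns the number of characters the output would need. The helper names are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes needed to hold v printed with "%d", including the trailing '\0'. */
static int bytes_needed_int(int v)
{
	int n = snprintf(NULL, 0, "%d", v);

	assert(n >= 0);
	return n + 1;
}

/* Bytes needed to hold v printed with "%u", including the trailing '\0'. */
static int bytes_needed_uint(unsigned int v)
{
	int n = snprintf(NULL, 0, "%u", v);

	assert(n >= 0);
	return n + 1;
}
```

bytes_needed_int(-1111111111) is 12, matching the count above. Assuming a 32-bit unsigned int, bytes_needed_uint(4294967295u) is 11, so char[10] is too small in the worst case even after switching to %u.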

+int set_zero_hugepages(int *initial_nr_hugepages)
+{
+	int fd, ret = -1;
+	char nr_hugepages[10] = {0};

Ditto?


Sure, makes sense. I'll just change that to 20 and make it unsigned.


+	fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);
+	if (fd < 0) {
+		ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
+			       strerror(errno));
+		goto out;
+	}
+
+	if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
+		ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
+			       strerror(errno));
+		goto close_fd;
+	}
+
+	lseek(fd, 0, SEEK_SET);
+
+	/* Start with the initial condition of 0 huge pages */
+	if (write(fd, "0", sizeof(char)) != sizeof(char)) {
+		ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
+			       strerror(errno));
+		goto close_fd;
+	}
+
+	*initial_nr_hugepages = atoi(nr_hugepages);
+	ret = 0;
+
+ close_fd:
+	close(fd);
+
+ out:
+	return ret;
+}
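
For illustration only, here is one way the read-then-reset step could look with the review comments folded in: an unsigned value, a 20-byte buffer, and a read that leaves room for the terminating '\0'. This is a sketch, not the patch as posted or as later applied. The function name read_and_zero() and the path parameter are hypothetical; the path is taken as an argument so the logic can be exercised on an ordinary file without root.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read an unsigned counter from path, then write "0" at offset 0.
 * On success, returns 0 and stores the old value in *initial.
 * On any failure, returns -1 and leaves *initial untouched.
 */
static int read_and_zero(const char *path, unsigned long *initial)
{
	char buf[20] = {0};
	char *end;
	unsigned long val;
	ssize_t n;
	int fd, ret = -1;

	assert(path && initial);

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* Keep one byte free so the buffer is always a valid C string. */
	n = read(fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		goto close_fd;
	buf[n] = '\0';

	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno || end == buf)
		goto close_fd;

	if (lseek(fd, 0, SEEK_SET) < 0)
		goto close_fd;
	if (write(fd, "0", 1) != 1)
		goto close_fd;

	*initial = val;
	ret = 0;

close_fd:
	close(fd);
	return ret;
}
```

A caller would pass "/proc/sys/vm/nr_hugepages" as path. On a regular file the write only overwrites the first byte; the sketch assumes, as the patch does, that the sysctl file treats the write as a complete new value.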