Re: [PATCH 2/2] selftests/mm: compaction_test: Fix trivial test success and reduce probability of OOM-killer invocation

From: Andrew Morton
Date: Sun May 19 2024 - 20:04:07 EST


On Wed, 15 May 2024 15:06:33 +0530 Dev Jain <dev.jain@xxxxxxx> wrote:

> Reset nr_hugepages to zero before the start of the test.
>
> If a non-zero number of hugepages is already set before the start of the
> test, the following problems arise:
>
> - The probability of the test getting OOM-killed increases.
> Proof: The test wants to run on 80% of available memory to prevent
> OOM-killing (see original code comments). Let the value of mem_free at the
> start of the test, when nr_hugepages = 0, be x. In the other case, when
> nr_hugepages > 0, let the memory consumed by hugepages be y. In the former
> case, the test operates on 0.8 * x of memory. In the latter, the test
> operates on 0.8 * (x - y) of memory while y is already in use, so the total
> memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 * x. Q.E.D
>
> - The probability of a bogus test success increases.
> Proof: Let the memory consumed by hugepages be greater than 25% of x,
> with x and y defined as above. The definition of compaction_index is
> c_index = (x - y)/z where z is the memory consumed by hugepages after
> trying to increase them again. In check_compaction(), we set the number
> of hugepages to zero, and then increase them back; the probability that
> they will be set back to consume at least y amount of memory again is
> very high (since there is not much delay between the two attempts of
> changing nr_hugepages). Hence, z >= y > (x/4) (by the 25% assumption).
> Therefore,
> c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3
> hence, c_index is always forced below 3, so the test always succeeds.
> Q.E.D
>
> NOTE: This patch depends on the previous one.
>
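
To make the two proofs above concrete, here is a minimal sketch of the
arithmetic in C. The helper names are hypothetical and not part of the
patch; x and y are in arbitrary but equal units, and the second helper
assumes z == y (the hugepages come back exactly as before), which is the
worst case allowed by z >= y. Integer division is used, as with the int
compaction_index in the test.

```c
/*
 * Hypothetical helper: total memory in use while the test runs, when
 * hugepages already hold y and the test takes 80% of the rest.
 */
static unsigned long memory_consumed(unsigned long x, unsigned long y)
{
	return y + (x - y) * 8 / 10;
}

/*
 * Hypothetical helper: upper bound on c_index = (x - y) / z, assuming
 * z == y. The caller must pass y > 0.
 */
static unsigned long c_index_bound(unsigned long x, unsigned long y)
{
	return (x - y) / y;
}
```

With x = 1000 and y = 300 (so y > x/4), memory_consumed() gives 860,
which exceeds 0.8 * x = 800, and c_index_bound() gives 2, which is
below 3.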
> -int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
> +int check_compaction(unsigned long mem_free, unsigned int hugepage_size,
> +		     int initial_nr_hugepages)
>  {
>  	int fd, ret = -1;
>  	int compaction_index = 0;
> -	char initial_nr_hugepages[10] = {0};
>  	char nr_hugepages[10] = {0};
> +	char init_nr_hugepages[10] = {0};
> +
> +	sprintf(init_nr_hugepages, "%d", initial_nr_hugepages);

Well, [10] isn't really large enough. "-1111111111" requires 12 chars,
with the trailing \0. And I'd suggest an unsigned type and a %u -
negative initial_nr_hugepages doesn't make a lot of sense.
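
A minimal sketch of what that could look like, assuming a 32-bit int.
The macro and helper names here are hypothetical, not from the patch;
the point is only the buffer size, the unsigned type with %u, and
snprintf() catching truncation instead of sprintf() overrunning.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical buffer size, assuming a 32-bit int: "-2147483648" is 11
 * characters, plus one byte for the trailing NUL.
 */
#define NR_HUGEPAGES_STR_LEN 12

/*
 * Hypothetical helper: format an unsigned hugepage count into buf.
 * Returns the string length on success, -1 if buf is too small.
 */
static int format_nr_hugepages(char *buf, size_t size, unsigned int nr)
{
	int len = snprintf(buf, size, "%u", nr);

	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

With a [10] buffer this helper refuses a ten-digit count rather than
writing past the end of the array.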

>
> +int set_zero_hugepages(int *initial_nr_hugepages)
> +{
> +	int fd, ret = -1;
> +	char nr_hugepages[10] = {0};

Ditto?

> +	fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);
> +	if (fd < 0) {
> +		ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
> +			       strerror(errno));
> +		goto out;
> +	}
> +
> +	if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
> +		ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
> +			       strerror(errno));
> +		goto close_fd;
> +	}
> +
> +	lseek(fd, 0, SEEK_SET);
> +
> +	/* Start with the initial condition of 0 huge pages */
> +	if (write(fd, "0", sizeof(char)) != sizeof(char)) {
> +		ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
> +			       strerror(errno));
> +		goto close_fd;
> +	}
> +
> +	*initial_nr_hugepages = atoi(nr_hugepages);
> +	ret = 0;
> +
> + close_fd:
> +	close(fd);
> +
> + out:
> +	return ret;
> +}
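
If initial_nr_hugepages does become unsigned, the atoi() call would
want to change with it. A minimal sketch of one way to do that with
strtoul(); the helper name is hypothetical and not part of the patch,
and it assumes the sysctl text looks like "128\n".

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: parse the text read from
 * /proc/sys/vm/nr_hugepages (for example "128\n") into an unsigned int.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_nr_hugepages(const char *str, unsigned int *nr)
{
	char *end;
	unsigned long val;

	/* strtoul() would silently accept a sign or leading spaces. */
	if (*str < '0' || *str > '9')
		return -1;

	errno = 0;
	val = strtoul(str, &end, 10);
	if (errno == ERANGE || val > UINT_MAX)
		return -1;

	/* Accept a trailing newline or the end of the string, nothing else. */
	if (*end != '\n' && *end != '\0')
		return -1;

	*nr = (unsigned int)val;
	return 0;
}
```

Unlike atoi(), this reports garbage input instead of quietly returning 0,
which would otherwise look like a valid initial hugepage count.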