Re: [linus:master] [sched/core] 704069649b: kernel-selftests.kvm.hardware_disable_test.fail

From: Sean Christopherson

Date: Fri Feb 13 2026 - 17:14:26 EST


On Fri, Feb 13, 2026, Peter Zijlstra wrote:
> On Thu, Feb 12, 2026 at 10:08:04PM +0800, kernel test robot wrote:
> > Hello,
> >
> > we found the kernel-selftests.kvm.hardware_disable_test failed consistently upon
> > this commit but pass on parent. unfortunately, we didn't find many useful
> > information in dmesg. this report is just FYI what we observed in our tests.
> >
> > kernel test robot noticed "kernel-selftests.kvm.hardware_disable_test.fail" on:
>
> With the caveat of PEBKAC (it is Friday after all); I can't reproduce.
>
> That is, ./hardware_disable_test as build from cee73b1e840c, doesn't
> work for me on 704069649b5b^1 either.
>
> Sean; is there a magic trick to operating that test, or is it a known
> trouble spot?

Hmm, shouldn't require any magic, and hasn't been known to be flaky.

This very decisively points at 704069649b5b ("sched/core: Rework
sched_class::wakeup_preempt() and rq_modified_*()") on my end as well. With
that commit reverted, the below runs in ~40ms total. With 704069649b5b present,
the test constantly stalls for multiple seconds at sem_timedwait().

AFAICT, the key is to have the busy_loop() pthread affined to the same CPU as
its parent. The KVM pieces of the selftest have nothing to do with the failure.

Here's a minimal reproducer that you can build without selftests goo :-)
E.g. `gcc -pthread -o busy busy.c` should work.

// SPDX-License-Identifier: GPL-2.0-only
#define _GNU_SOURCE

#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

sem_t *sem;

static void *busy_loop(void *arg)
{
	for (;;)
		;

	return NULL;
}

static void run_test(uint32_t run)
{
	pthread_t thread;
	cpu_set_t cpuset;

	/* Pin the busy thread to whatever CPU this (child) task is on. */
	CPU_ZERO(&cpuset);
	CPU_SET(sched_getcpu(), &cpuset);

	printf("%s: [%u] spawn busy thread\n", __func__, run);
	if (pthread_create(&thread, NULL, busy_loop, (void *)NULL))
		exit(-1);

	if (pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset))
		exit(-1);

	printf("%s: [%u] thread launched\n", __func__, run);
	sem_post(sem);

	pthread_join(thread, NULL);
	printf("child pthread exited prematurely\n");
	exit(-1);
}

void wait_for_child_setup(pid_t pid)
{
	/*
	 * Wait for the child to post to the semaphore, but wake up periodically
	 * to check if the child exited prematurely.
	 */
	for (;;) {
		const struct timespec wait_period = { .tv_sec = 1 };
		int status;

		if (!sem_timedwait(sem, &wait_period))
			return;

		/* Child is still running, keep waiting. */
		if (pid != waitpid(pid, &status, WNOHANG))
			continue;

		/*
		 * Child is no longer running, which is not expected.
		 *
		 * If it exited with a non-zero status, we explicitly forward
		 * the child's status in case it exited with KSFT_SKIP.
		 */
		if (WIFEXITED(status))
			exit(WEXITSTATUS(status));

		printf("Child exited unexpectedly\n");
		exit(-1);
	}
}

int main(int argc, char **argv)
{
	uint32_t i;
	int s, r;
	pid_t pid;

	sem = sem_open("vm_sem", O_CREAT | O_EXCL, 0644, 0);
	if (sem == SEM_FAILED)
		exit(-1);
	sem_unlink("vm_sem");

	for (i = 0; i < 512; ++i) {
		pid = fork();
		if (pid < 0)
			exit(-1);
		if (pid == 0)
			run_test(i); /* This function always exits */

		printf("%s: [%u] waiting semaphore\n", __func__, i);
		wait_for_child_setup(pid);

		printf("%s: [%u] do waitpid\n", __func__, i);
		r = waitpid(pid, &s, WNOHANG);
		if (r == pid) {
			printf("%s: [%u] child exited unexpectedly status: [%d]\n",
			       __func__, i, s);
			exit(-1);
		}
		printf("%s: [%u] killing child\n", __func__, i);
		kill(pid, SIGKILL);
	}

	/* Named semaphores are released with sem_close(), not sem_destroy(). */
	sem_close(sem);
	exit(0);
}