Re: [BUG] Random hard lockup with userspace %ip on 7.0-rc5

From: Calvin Owens

Date: Tue Mar 31 2026 - 21:58:40 EST


On Wednesday 03/25 at 17:56 +0100, Thomas Gleixner wrote:
> On Wed, Mar 25 2026 at 10:03, Petr Mladek wrote:
> > Adding scheduler and NOHZ people into Cc. Hardlockup in userspace with
> > interrupts enabled is strange indeed. Plus, there is some nohz-related
> > function on stack...
>
> That's a leftover from a previous syscall and ixgbe_poll is from the
> most recent soft interrupt handling. Nothing to see here.
>
> > On Tue 2026-03-24 16:32:09, Calvin Owens wrote:
> >> Hi all,
> >>
> >> I hit the below hard lockup while building kernels a couple days ago. I
> >> haven't been able to reproduce it again.
> >>
> >> It's clearly not actually a hard lockup, the NMI fired while %ip was in
> >> userspace with irqs enabled:
>
> Yeah. Doesn't make any sense at all. Does not ring any bell.

I got it!

The below userspace reproducer consistently triggers the hard lockup
on two different machines with an AMD 7950X3D and an AMD 9950X3D CPU.

However, it never reproduces at all on a Xeon E-2124. Maybe a clue?

I wish I had a nice clever story for how I found it, but I just guessed
based on how systemd uses timerfd_settime().

Can anybody else reproduce with this?

I'll investigate more tomorrow, but I need to get some sleep first :)

Thanks,
Calvin

---
#include <unistd.h>
#include <err.h>
#include <pthread.h>
#include <sys/timerfd.h>

#ifndef NR_THREADS
#define NR_THREADS 32
#endif

static void set(int fd)
{
	struct itimerspec new = {
		.it_value = {
			.tv_sec = 0,
			.tv_nsec = 1,
		},
	};

	if (timerfd_settime(fd, TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET,
			    &new, NULL))
		err(2, "Can't set timer");
}

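static void *fn(void *arg)
{
	int fd = timerfd_create(CLOCK_MONOTONIC, 0);

	if (fd < 0)
		err(2, "Can't create timerfd");

	/* Re-arm the timer in a tight loop; this never returns. */
	while (1)
		set(fd);

	close(fd);
	return NULL;
}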

int main(int argc, char **argv)
{
	pthread_t threads[NR_THREADS];
	int i;

	for (i = 0; i < NR_THREADS; i++)
		if (pthread_create(&threads[i], NULL, fn, NULL))
			errx(1, "Can't make thread %d", i);

	for (i = 0; i < NR_THREADS; i++)
		if (pthread_join(threads[i], NULL))
			errx(1, "Can't join thread %d", i);

	return 0;
}