vdso-related userspace crashes on 5.5 mips64

From: Jason A. Donenfeld
Date: Mon Dec 23 2019 - 08:08:51 EST


Hi,

I'm experiencing VDSO-related crashes on 5.5 with MIPS64. The MIPS64
builders on build.wireguard.com are all red at the moment.

It looks like libc is crashing with a null pointer dereference when
doing any work after returning from clock_gettime. This manifests
itself, for me, with calls to clock_gettime(CLOCK_PROCESS_CPUTIME_ID),
because CLOCK_PROCESS_CPUTIME_ID is not in the VDSO. It looks in the
VDSO, doesn't find it, and then proceeds to make the real syscall, at
which point it crashes. I can simulate the same crash by simply adding
a printf after a successful call to the vdso before returning. For
example:

int __clock_gettime(clockid_t clk, struct timespec *ts)
{
	int r;

#ifdef VDSO_CGT_SYM
	int (*f)(clockid_t, struct timespec *) =
		(int (*)(clockid_t, struct timespec *))vdso_func;
	printf("vdso %p\n", f); // <-- this line does NOT crash.
	if (f) {
		r = f(clk, ts);
		if (!r) {
			printf("ret %d\n", r); // <-- this line DOES crash.
			return r;
		}
		if (r == -EINVAL)
			return __syscall_ret(r);
	}
#endif
	printf("falling through\n"); // <-- this line DOES crash.
	r = __syscall(SYS_clock_gettime, clk, ts); // <-- also, this line will crash too
	if (r == -ENOSYS) {
		if (clk == CLOCK_REALTIME) {
			__syscall(SYS_gettimeofday, ts, 0);
			ts->tv_nsec = (int)ts->tv_nsec * 1000;
			return 0;
		}
		r = -EINVAL;
	}
	return __syscall_ret(r);
}

It seems like somehow the stack frame is corrupted/unusable after a call
to the vdso. But, returning immediately from clock_gettime after a call
to the vdso allows the program to continue. Thus, this problem only
manifests itself when using clocks that aren't handled by the vdso.

It's possible this is due to some compiler ABI mismatch situation
between userspace and kernelspace. However, I've only started seeing
this happen with 5.5 and not on 5.4.

Does the above description immediately point to some recognizable
change? If not, I'll keep debugging.

Thanks,
Jason