Re: vdso-related userspace crashes on 5.5 mips64

From: Paul Burton
Date: Mon Dec 23 2019 - 18:28:01 EST


Hi Jason,

Copying Vincenzo.

On Mon, Dec 23, 2019 at 02:08:34PM +0100, Jason A. Donenfeld wrote:
> I'm experiencing VDSO-related crashes on 5.5 with MIPS64. The MIPS64
> builders on build.wireguard.com are all red at the moment.
>
> It looks like libc is crashing with a null pointer dereference when
> doing any work after returning from clock_gettime. This manifests
> itself, for me, with calls to clock_gettime(CLOCK_PROCESS_CPUTIME_ID),
> because CLOCK_PROCESS_CPUTIME_ID is not in the VDSO. It looks in the
> VDSO, doesn't find it, and then proceeds to make the real syscall, when
> it crashes. I can simulate the same crash by simply adding a printf
> after a successful call to the vdso before returning. For example:
>
> int __clock_gettime(clockid_t clk, struct timespec *ts)
> {
> 	int r;
>
> #ifdef VDSO_CGT_SYM
> 	int (*f)(clockid_t, struct timespec *) =
> 		(int (*)(clockid_t, struct timespec *))vdso_func;
> 	printf("vdso %p\n", f); // <-- this line does NOT crash.
> 	if (f) {
> 		r = f(clk, ts);
> 		if (!r) {
> 			printf("ret %d\n", r); // <-- this line DOES crash.
> 			return r;
> 		}
> 		if (r == -EINVAL)
> 			return __syscall_ret(r);
> 	}
> #endif
> 	printf("falling through\n"); // <--- this line DOES crash.
> 	r = __syscall(SYS_clock_gettime, clk, ts); // <-- also, this line will crash too
> 	if (r == -ENOSYS) {
> 		if (clk == CLOCK_REALTIME) {
> 			__syscall(SYS_gettimeofday, ts, 0);
> 			ts->tv_nsec = (int)ts->tv_nsec * 1000;
> 			return 0;
> 		}
> 		r = -EINVAL;
> 	}
> 	return __syscall_ret(r);
> }
>
> It seems like somehow the stack frame is corrupted/unusable after a call
> to the vdso. But, returning immediately from clock_gettime after a call
> to the vdso allows the program to continue. Thus, this problem only
> manifests itself when using clocks that aren't handled by the vdso.
>
> It's possible this is due to some compiler ABI mismatch situation
> between userspace and kernelspace. However, I've only started seeing
> this happen with 5.5 and not on 5.4.
>
> Does the above description immediately point to some recognizable
> change? If not, I'll keep debugging.

There is one pending fix for the VDSO in mips-fixes, commit 7d2aa4bb90f5
("mips: Fix gettimeofday() in the vdso library") but your symptoms sound
different to the problem fixed there...

Could you share your kernel config & tell us which platform you're
running on? (QEMU Malta?)
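
If it helps narrow things down, a small standalone test along the lines
below might be useful. This is only a sketch and is not taken from your
report: probe_clock() is a hypothetical helper name, and I'm assuming
that CLOCK_MONOTONIC is served by the VDSO on your setup while
CLOCK_PROCESS_CPUTIME_ID falls back to the real syscall, as you
describe. The printf after each call stands in for the "any work after
returning from clock_gettime" that you see crashing.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: query one clock, then do some further work
 * (a printf) after clock_gettime() returns. Returns clock_gettime()'s
 * result: 0 on success, -1 on failure.
 */
static int probe_clock(const char *name, clockid_t clk)
{
	struct timespec ts = { 0, 0 };
	int r = clock_gettime(clk, &ts);

	printf("%s: ret=%d time=%lld.%09ld\n",
	       name, r, (long long)ts.tv_sec, ts.tv_nsec);
	return r;
}

/*
 * Probe one clock assumed to be handled by the VDSO and one that is
 * not. Returns the number of clocks for which clock_gettime() failed.
 */
static int probe_all(void)
{
	int failures = 0;

	failures += probe_clock("CLOCK_MONOTONIC", CLOCK_MONOTONIC) != 0;
	failures += probe_clock("CLOCK_PROCESS_CPUTIME_ID",
				CLOCK_PROCESS_CPUTIME_ID) != 0;
	return failures;
}
```

Calling probe_all() from main() in a statically linked binary and
running it on both 5.4 and 5.5 should show whether the crash follows
the kernel or the userspace build.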

Thanks,
Paul