Re: [REGRESSION] x86/entry: Tracer no longer has opportunity to change the syscall number at entry via orig_ax

From: Michael Ellerman
Date: Sun Sep 13 2020 - 03:44:20 EST


Kees Cook <keescook@xxxxxxxxxxxx> writes:
> On Wed, Sep 09, 2020 at 11:53:42PM +1000, Michael Ellerman wrote:
>> I can observe the difference between v5.8 and mainline, using the
>> raw_syscall trace event and running the seccomp_bpf selftest which turns
>> a getpid (39) into a getppid (110).
>>
>> With v5.8 we see getppid on entry and exit:
>>
>> seccomp_bpf-1307 [000] .... 22974.874393: sys_enter: NR 110 (7ffff22c46e0, 40a350, 4, fffffffffffff7ab, 7fa6ee0d4010, 0)
>> seccomp_bpf-1307 [000] .N.. 22974.874401: sys_exit: NR 110 = 1304
>>
>> Whereas on mainline we see an enter for getpid and an exit for getppid:
>>
>> seccomp_bpf-1030 [000] .... 21.806766: sys_enter: NR 39 (7ffe2f6d1ad0, 40a350, 7ffe2f6d1ad0, 0, 0, 407299)
>> seccomp_bpf-1030 [000] .... 21.806767: sys_exit: NR 110 = 1027
>
> For my own notes, this is how I reproduced it:
>
> # ./perf-$VER record -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit &
> # ./seccomp_bpf
> # fg
> ctrl-c
> # ./perf-$VER script | grep seccomp_bpf | awk '{print $7}' | sort | uniq -c > $VER.log
> *repeat*
> # diff -u old.log new.log
> ...
>
> (Is there an easier way to get those results?)

I did more or less the same thing, except I ran the trace event manually
(via debugfs), which is no better really.

I think the right way to test it would be to have a test that modifies
the syscall via seccomp and also monitors the trace event using perf
events. But that wouldn't be easier :)

> I will go see if I can figure out the best way to correct this.

I think this works?

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 18683598edbc..901361e2f8ea 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -60,13 +60,15 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
return ret;
}

+ syscall = syscall_get_nr(current, regs);
+
if (unlikely(ti_work & _TIF_SYSCALL_TRACEPOINT))
trace_sys_enter(regs, syscall);

syscall_enter_audit(regs, syscall);

/* The above might have changed the syscall number */
- return ret ? : syscall_get_nr(current, regs);
+ return ret ? : syscall;
}

static __always_inline long


cheers