Re: [RFC PATCH 01/22] kernel/entry/common: Move syscall_enter_from_user_mode_work() out of header
From: Peter Zijlstra
Date: Thu Feb 20 2025 - 05:47:16 EST
On Thu, Feb 20, 2025 at 09:32:36AM +0000, K Prateek Nayak wrote:
> Retain the prototype of syscall_enter_from_user_mode_work() in
> linux/entry-common.h and move the function definition to
> kernel/entry/common.c in preparation to notify the scheduler of task
> entering and exiting kernel mode for syscall. The two architectures that
> use it directly (x86, s390) and the four that call it via
> syscall_enter_from_user_mode() (x86, riscv, loongarch, s390) end up
> selecting GENERIC_ENTRY, hence, no functional changes are intended.
>
> syscall_enter_from_user_mode_work() will end up calling function whose
> visibility needs to be limited for kernel/* use only for cfs throttling
> deferral.
>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> ---
> include/linux/entry-common.h | 10 +---------
> kernel/entry/common.c | 10 ++++++++++
> 2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index fc61d0205c97..7569a49cf7a0 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -161,15 +161,7 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
> * ptrace_report_syscall_entry(), __secure_computing(), trace_sys_enter()
> * 2) Invocation of audit_syscall_entry()
> */
> -static __always_inline long syscall_enter_from_user_mode_work(struct pt_regs *regs, long syscall)
> -{
> - unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
> -
> - if (work & SYSCALL_WORK_ENTER)
> - syscall = syscall_trace_enter(regs, syscall, work);
> -
> - return syscall;
> -}
> +long syscall_enter_from_user_mode_work(struct pt_regs *regs, long syscall);
>
> /**
> * syscall_enter_from_user_mode - Establish state and check and handle work
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index e33691d5adf7..cc93cdcc36d0 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -79,6 +79,16 @@ noinstr void syscall_enter_from_user_mode_prepare(struct pt_regs *regs)
> instrumentation_end();
> }
>
> +__always_inline long syscall_enter_from_user_mode_work(struct pt_regs *regs, long syscall)
> +{
> + unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
> +
> + if (work & SYSCALL_WORK_ENTER)
> + syscall = syscall_trace_enter(regs, syscall, work);
> +
> + return syscall;
> +}
> +
> /* Workaround to allow gradual conversion of architecture code */
> void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
This breaks s390. While you retain an external linkage, the function
looses the noinstr tag that's needed for correctness.
Also, extern __always_inline is flaky as heck. Please don't do this.