Re: [PATCH v4 1/2] kernel: Implement selective syscall userspace redirection

From: Mark Rutland
Date: Tue Jul 21 2020 - 08:06:53 EST


On Thu, Jul 16, 2020 at 09:48:50PM -0700, Andy Lutomirski wrote:
> On Thu, Jul 16, 2020 at 7:15 PM Gabriel Krisman Bertazi
> <krisman@xxxxxxxxxxxxx> wrote:
> >
> > Andy Lutomirski <luto@xxxxxxxxxx> writes:
> >
> > > On Thu, Jul 16, 2020 at 12:31 PM Gabriel Krisman Bertazi
> > > <krisman@xxxxxxxxxxxxx> wrote:
> > >>
> > >
> > > This is quite nice. I have a few comments, though:
> > >
> > > You mentioned rt_sigreturn(). Should this automatically exempt the
> > > kernel-provided signal restorer on architectures (e.g. x86_32) that
> > > provide one?
> >
> > That seems reasonable. Not sure how easy it is to do it, though.
>
> For better or for worse, it's currently straightforward because the code is:
>
> __kernel_sigreturn:
> .LSTART_sigreturn:
>         popl %eax               /* XXX does this mean it needs unwind info? */
>         movl $__NR_sigreturn, %eax
>         SYSCALL_ENTER_KERNEL
>
> and SYSCALL_ENTER_KERNEL is hardwired as int $0x80. (The latter is
> probably my fault, for better or for worse.) So this would change to:
>
> __vdso32_sigreturn_syscall:
>         SYSCALL_ENTER_KERNEL
>
> and vdso2c would wire up __vdso32_sigreturn_syscall. Then there would
> be something like:
>
> bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs);
>
> and that would be that. Does anyone have an opinion as to whether
> this is a good idea? Modern glibc shouldn't be using this mechanism,
> I think, but I won't swear to it.

On arm64, sigreturn always goes through the vdso, so IIUC we'd certainly
need something like this. Otherwise it'd be the user's responsibility to
register the vdso sigtramp range when making the prctl, and to flip the
selector in each signal handler, which sounds both painful and fragile.
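
To make the shape of such a check concrete, here is a rough userspace
model of what an arch hook along these lines might do. It is only a
sketch under assumptions: struct vdso_model, struct regs_model, the
sigreturn_pad_off field and model_syscall_is_vdso_sigreturn() are all
hypothetical names made up for illustration, not existing kernel
interfaces. It also assumes the saved instruction pointer at syscall
entry is the address immediately after the trampoline's syscall
instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-process vdso state. */
struct vdso_model {
	uintptr_t base;              /* where the vdso is mapped, 0 if none */
	uintptr_t sigreturn_pad_off; /* offset of the address right after the
	                              * trampoline's syscall instruction */
};

/* Hypothetical, minimal stand-in for struct pt_regs. */
struct regs_model {
	uintptr_t ip; /* saved instruction pointer at syscall entry */
};

/*
 * Return true only if the syscall was issued by the vdso sigreturn
 * trampoline, i.e. the saved ip matches the address just past the
 * trampoline's syscall instruction. Any other caller is not exempt.
 */
static inline bool
model_syscall_is_vdso_sigreturn(const struct vdso_model *vdso,
				const struct regs_model *regs)
{
	if (!vdso->base)
		return false; /* no vdso mapped, nothing to exempt */

	return regs->ip == vdso->base + vdso->sigreturn_pad_off;
}
```

With something like this in the syscall entry path, the dispatch code
could let the trampoline's syscall through without userspace having to
know where the vdso sigtramp lives.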

Mark.