Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas
From: Andy Lutomirski
Date: Sun May 31 2020 - 14:10:23 EST
On Sun, May 31, 2020 at 5:56 AM Paul Gofman <gofmanp@xxxxxxxxx> wrote:
>
> On 5/31/20 03:59, Andy Lutomirski wrote:
> >
> > I'm suggesting that the kernel learn how to help you, maybe like this:
> >
> > prctl(PR_SET_SYSCALL_THUNK, target, address_of_unredirected_syscall, 0, 0, 0, 0);
> >
> > This would be inherited on clone/fork and cleared on execve.
> >
> If we are talking about explicit specification of syscall addresses to
> be trapped by Wine, the problem here is that we don't have any way of
> knowing the exact addresses of syscalls to be redirected. We would need
> some way to find those syscalls in the highly obfuscated dynamically
> generated code, the whole purpose of which is to prevent disassembling,
> debugging and finding things like that in it. What we do know is that if
> a syscall is executed from any memory which Wine allocates for a Windows
> application, then it should be treated as a Windows syscall and routed to
> Wine's dispatch function. Those code areas can be dynamically
> allocated and deallocated.
That's not what I meant. I meant that you would set the kernel up to
redirect *all* syscalls from the thread with the sole exception of one
syscall instruction in the thunk. This would catch Windows syscalls
and Linux syscalls. The thunk would determine whether the original
syscall was Linux or Windows and handle it accordingly.
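
To make the classification step concrete, here is a rough userspace sketch
in C of how the thunk might decide "Linux or Windows" from the address of
the redirected syscall instruction. Everything in it is an assumption for
illustration: the names win_region_add(), win_region_remove() and
is_windows_syscall(), the fixed-size table, and the idea that the thunk is
handed the original instruction pointer at all. None of this is an existing
Wine or kernel interface, and PR_SET_SYSCALL_THUNK itself is only a
proposal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one memory range Wine allocated for Windows application code. */
struct win_region {
    uintptr_t start;  /* inclusive */
    uintptr_t end;    /* exclusive */
};

#define MAX_REGIONS 64

static struct win_region regions[MAX_REGIONS];
static size_t nr_regions;

/* Record a newly allocated range. Fails if the table is full, the range is
 * empty, or start + len would overflow. */
static bool win_region_add(uintptr_t start, size_t len)
{
    if (nr_regions == MAX_REGIONS || len == 0 || start > UINTPTR_MAX - len)
        return false;
    regions[nr_regions].start = start;
    regions[nr_regions].end = start + len;
    nr_regions++;
    return true;
}

/* Forget the range that begins at 'start' (called when it is deallocated).
 * The last entry is moved into the freed slot, so order is not preserved. */
static bool win_region_remove(uintptr_t start)
{
    for (size_t i = 0; i < nr_regions; i++) {
        if (regions[i].start == start) {
            regions[i] = regions[--nr_regions];
            return true;
        }
    }
    return false;
}

/* Classification the thunk would perform: a syscall whose instruction
 * pointer lies inside a tracked range is treated as a Windows syscall;
 * anything else is treated as a native Linux syscall. */
static bool is_windows_syscall(uintptr_t ip)
{
    for (size_t i = 0; i < nr_regions; i++) {
        if (ip >= regions[i].start && ip < regions[i].end)
            return true;
    }
    return false;
}
```

A real thunk would also have to cope with concurrent updates from other
threads and could not take locks freely on this path; the linear scan is
only there to keep the sketch short.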
This may interact poorly with the DRM scheme. The redzone might need
to be respected, or stack switching might be needed.