Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas

From: Billy Laws
Date: Mon Jun 01 2020 - 05:23:39 EST


> On May 30, 2020, at 5:26 PM, Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote:
>
> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>
>>>> On May 29, 2020, at 11:00 PM, Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote:
>>>
>>> Modern Windows applications are executing system call instructions
>>> directly from the application's code without going through the WinAPI.
>>> This breaks Wine emulation, because it doesn't have a chance to
>>> intercept and emulate these syscalls before they are submitted to Linux.
>>>
>>> In addition, we cannot simply trap every system call of the application
>>> to userspace using PTRACE_SYSEMU, because performance would suffer,
>>> since our main use case is to run Windows games over Linux. Therefore,
>>> we need some in-kernel filtering to decide whether the syscall was
>>> issued by the wine code or by the windows application.
>>
>> Do you really need in-kernel filtering? What if you could have
>> efficient userspace filtering instead? That is, set something up so
>> that all syscalls, except those from a special address, are translated
>> to CALL thunk where the thunk is configured per task. Then the thunk
>> can do whatever emulation is needed.
>
> Hi,
>
> I suggested something similar to my customer, by using
> libsyscall-intercept. The idea would be to overwrite the syscall
> instruction with a call to the entry point. I'm not a specialist on the
> specifics of Windows games, (cc'ed Paul Gofman, who can provide more
> details on that side), but as far as I understand, the reason why that
> is not feasible is that the anti-cheat protection in games will abort
> execution if the binary region was modified either on-disk or in-memory.
>
> Is there some mechanism to do that without modifying the application?

Hi,

I work on an emulator for the Nintendo Switch that uses a similar
technique. In our testing it works very well and is considerably faster
than even PTRACE_SYSEMU.

To work around DRM reading the memory contents, I think mprotect could
be used: after patching the syscall, a copy of the original code could be
kept somewhere in memory and the patched region mapped --X (execute-only).
With this, any time the DRM attempts to read the patched region to
perform integrity checks, it will cause a segfault and a branch to the
signal handler. The handler can then return the contents of the original,
unpatched region to satisfy those checks.
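
A minimal sketch of the idea above, not code from any existing project.
All names here (patched_region, g_region, shadow_lookup, patch_region,
emulate_load, on_segv) are hypothetical. It assumes the code region is
page-aligned and that the hardware enforces execute-only mappings; on
x86 without protection keys a PROT_EXEC-only mapping stays readable, so
reads would not trap there. Decoding and emulating the faulting load is
architecture-specific and is left as an unset hook.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one patched code region. */
struct patched_region {
    uint8_t *code;   /* page-aligned start of the patched code */
    uint8_t *shadow; /* private copy of the original, unpatched bytes */
    size_t len;
};

/* One global region for simplicity; the signal handler consults it. */
struct patched_region g_region;

/* Hypothetical, arch-specific hook: decode the faulting load from the
 * saved context, feed it the original byte(s) and step past it.
 * Returns 0 on success. Deliberately left unset in this sketch. */
int (*emulate_load)(void *uctx, const uint8_t *orig);

/* Map an address inside the patched region to the matching original
 * byte in the shadow copy, or NULL if the address is outside it. */
const uint8_t *shadow_lookup(const struct patched_region *r, const void *addr)
{
    uintptr_t a = (uintptr_t)addr, base = (uintptr_t)r->code;

    if (r->code == NULL || a < base || a - base >= r->len)
        return NULL;
    return r->shadow + (a - base);
}

/* Save the original bytes, apply the patch, then map the code --X.
 * On failure the region may be left writable; a real implementation
 * would need to restore the previous protection. */
int patch_region(struct patched_region *r, uint8_t *code, size_t len,
                 size_t off, const uint8_t *patch, size_t patch_len)
{
    assert(r != NULL && code != NULL && patch != NULL);
    if (off > len || patch_len > len - off)
        return -1;

    uint8_t *shadow = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (shadow == MAP_FAILED)
        return -1;
    if (mprotect(code, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(shadow, len);
        return -1;
    }
    memcpy(shadow, code, len);            /* keep the unpatched copy */
    memcpy(code + off, patch, patch_len); /* overwrite the syscall site */
    __builtin___clear_cache((char *)code, (char *)code + len);
    if (mprotect(code, len, PROT_EXEC) != 0) {
        munmap(shadow, len);
        return -1;
    }
    r->code = code;
    r->shadow = shadow;
    r->len = len;
    return 0;
}

/* SIGSEGV handler: a read of the patched region is answered from the
 * shadow copy; anything else falls back to the default action. */
void on_segv(int sig, siginfo_t *info, void *uctx)
{
    const uint8_t *orig = shadow_lookup(&g_region, info->si_addr);

    (void)sig;
    if (orig != NULL && emulate_load != NULL && emulate_load(uctx, orig) == 0)
        return;
    signal(SIGSEGV, SIG_DFL); /* returning re-faults and terminates */
}

int install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Every trapped read costs a full signal delivery plus the load
emulation, which is why the frequency of the integrity checks matters.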

Are memory contents checked by DRM solutions too often for this to be
performant?
--
Billy Laws
>
>> Getting the details and especially the interaction with any seccomp
>> filters that may be installed right could be tricky, but the performance
>> should be decent, at least on non-PTI systems.
>>
>> (If we go this route, I suspect that the correct interaction with
>> seccomp is that this type of redirection takes precedence over seccomp
>> and seccomp filters are not invoked for redirected syscalls. After all,
>> a redirected syscall is, functionally, not a syscall at all.)
>>
>
>
> --
> Gabriel Krisman Bertazi