Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

From: Andy Lutomirski
Date: Mon Jul 22 2019 - 15:17:33 EST


On Mon, Jul 22, 2019 at 11:39 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Mon, Jul 22, 2019 at 08:31:32PM +0200, Thomas Gleixner wrote:
> > On Mon, 22 Jul 2019, Kees Cook wrote:
> > > Just so I'm understanding: the vDSO change introduced code to make an
> > > actual syscall on i386, which for most seccomp filters would be rejected?
> >
> > No. The old x86 specific VDSO implementation had a fallback syscall as
> > well, i.e. clock_gettime(). On 32bit clock_gettime() uses the y2038
> > endangered timespec.
> >
> > So when the VDSO was made generic we changed the internal data structures
> > to be 2038 safe right away. As a consequence the fallback syscall is not
> > clock_gettime(), it's clock_gettime64(). which seems to surprise seccomp.
>
> Okay, it's didn't add a syscall, it just changed it. Results are the
> same: conservative filters suddenly start breaking due to the different
> call. (And now I see why Andy's alias suggestion would help...)
>
> I'm not sure which direction to do with this. It seems like an alias
> list is a large hammer for this case, and a "seccomp-bypass when calling
> from vDSO" solution seems too fragile?
>

I don't like the seccomp bypass at all. If someone uses seccomp to
disallow all clock_gettime() variants, there shouldn't be a back door
to learn the time.

Here's the restart_syscall() logic that makes me want aliases: we have
different syscall numbers for restart_syscall() on 32-bit and 64-bit.
The logic to decide which one to use is dubious at best. I'd like to
introduce a restart_syscall2() that is identical to restart_syscall()
except that it has the same number on both variants.

--Andy