Re: [PATCH] seccomp: allow BPF_MOD ALU instructions
From: Anton Protopopov
Date: Wed Mar 18 2020 - 11:23:33 EST
Wed, 18 Mar 2020 at 00:06, Kees Cook <keescook@xxxxxxxxxxxx>:
>
> On Tue, Mar 17, 2020 at 09:11:57PM -0400, Anton Protopopov wrote:
> > Tue, 17 Mar 2020 at 16:21, Kees Cook <keescook@xxxxxxxxxxxx>:
> > >
> > > On Mon, Mar 16, 2020 at 06:17:34PM -0400, Anton Protopopov wrote:
> > > > and in every case to walk only the corresponding factor-list. In my case
> > > > I had a list of ~40 syscall numbers, and after this change the filter
> > > > executed in 17.25 instructions on average per syscall vs. 45
> > > > instructions for the linear filter (so this removes a penalty of about
> > > > 30 instructions per syscall). To replace "mod #4" I
> > > > actually used "and #3", but this obviously doesn't work for
> > > > non-power-of-two divisors. If I used "mod 5", it would give
> > > > me about 15.5 instructions on average.
> > >
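To make the factor-list idea concrete, here is a rough sketch in plain C
rather than real BPF. It only counts how many list entries get compared
when a lookup walks just the entries sharing the syscall number's residue.
The function names and the comparison-counting model are assumptions made
for illustration; they are not taken from the patch or from any filter
generator.

```c
#include <stddef.h>

/* "and #(m-1)" gives the same result as "mod #m" only when m is a
 * power of two; for other divisors it does not. */
static unsigned residue_and(unsigned nr, unsigned m)
{
    return nr & (m - 1);
}

/* Count the comparisons needed to look up nr when the allow-list is
 * split into m residue lists and only the matching list is walked.
 * (Illustrative model: the dispatch on nr % m itself is not counted.) */
static size_t bucketed_steps(const int *allowed, size_t n, unsigned m,
                             int nr, int *found)
{
    size_t steps = 0;
    unsigned want = (unsigned)nr % m;

    *found = 0;
    for (size_t i = 0; i < n; i++) {
        if ((unsigned)allowed[i] % m != want)
            continue;           /* entry belongs to another residue list */
        steps++;
        if (allowed[i] == nr) {
            *found = 1;
            break;
        }
    }
    return steps;
}
```

A real filter also pays for loading the syscall number, computing the
residue and jumping to the right list, so instruction counts will be
higher than the comparison counts this model reports.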
> > > Gotcha. My real concern is with breaking the ABI here -- using BPF_MOD
> > > would mean a process couldn't run on older kernels without some tricks
> > > on the seccomp side.
> >
> > Yes, understood. Could you tell me what exactly you would do if there
> > were a real need for a new instruction?
>
> I'd likely need to introduce some kind of way to query (and declare) the
> "language version" of seccomp filters. New programs would need to
> declare the language level (EINVAL would mean the program must support
> the original "v1", ENOTSUPP would mean "kernel doesn't support that
> level"), and the program would have to build a filter based on the
> supported language features. The kernel would assume all undeclared
> seccomp users were "v1" and would need to reject BPF_MOD. All programs
> declaring "v2" would be allowed to use BPF_MOD.
>
> It's really a lot for something that isn't really needed. :)

Right :) Thanks for the explanations!

> > > Since the syscall list is static for a given filter, why not arrange it
> > > as a binary search? That should get even better average instructions
> > > as O(log n) instead of O(n).
> >
> > Right, thanks! This saves about 4 more instructions for my case and
> > works 1-2 ns faster.
>
> Excellent!
>
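For reference, the difference between the two layouts can be sketched in
plain C by counting comparisons instead of emitting BPF. This is only an
illustrative model (the helper names are made up here, and a real filter
adds load, jump and return instructions on top), not the filter I
actually run.

```c
#include <stddef.h>

/* Comparisons made by a linear chain of equality tests before it
 * matches nr or falls off the end of the list. */
static size_t linear_steps(const int *allowed, size_t n, int nr, int *found)
{
    size_t steps = 0;

    *found = 0;
    for (size_t i = 0; i < n; i++) {
        steps++;
        if (allowed[i] == nr) {
            *found = 1;
            break;
        }
    }
    return steps;
}

/* Same lookup over a sorted list, halving the remaining range on
 * every comparison, as a binary decision tree in the filter would. */
static size_t bsearch_steps(const int *sorted, size_t n, int nr, int *found)
{
    size_t lo = 0, hi = n, steps = 0;

    *found = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        steps++;
        if (sorted[mid] == nr) {
            *found = 1;
            break;
        }
        if (sorted[mid] < nr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return steps;
}
```

For a 40-entry list the binary search never needs more than 6 probes
in this model, while the linear walk needs up to 40.
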
> > > Though frankly I've also been considering an ABI version bump for adding
> > > a syscall bitmap feature: the vast majority of seccomp filters are just
> > > binary yes/no across a list of syscalls. Only the special cases need
> > > special handling (arg inspection, fd notification, etc). Then these
> > > kinds of filters could run as O(1).
>
> *This* feature wouldn't need my crazy language version idea, but it
> _would_ still need to be detectable, much like how RET_USER_NOTIF was
> added.
>
> --
> Kees Cook
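
For the plain yes/no case, the O(1) lookup could look roughly like the
sketch below. It is a userspace illustration only: NR_MAX, the struct and
the function names are hypothetical, and nothing here reflects an actual
or proposed kernel interface.

```c
#include <limits.h>
#include <stddef.h>

#define NR_MAX 512  /* hypothetical upper bound on syscall numbers */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per syscall number: 1 = allow, 0 = fall through to the
 * slower path (or deny). */
struct syscall_bitmap {
    unsigned long bits[(NR_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD];
};

static void bitmap_allow(struct syscall_bitmap *bm, unsigned nr)
{
    if (nr < NR_MAX)
        bm->bits[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Constant-time check: one bounds test, one load, one shift. */
static int bitmap_allowed(const struct syscall_bitmap *bm, unsigned nr)
{
    if (nr >= NR_MAX)
        return 0;
    return (int)((bm->bits[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL);
}
```

Syscalls that need argument inspection or notification would simply not
be set in the bitmap and would still go through the regular filter.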