Re: [PATCH v4 seccomp 5/5] seccomp/cache: Report cache data through /proc/pid/seccomp_cache

From: YiFei Zhu
Date: Tue Nov 03 2020 - 08:03:57 EST


On Fri, Oct 30, 2020 at 7:18 AM YiFei Zhu <zhuyifei1999@xxxxxxxxx> wrote:
> I got a bare metal test machine with Intel(R) Xeon(R) CPU E5-2660 v3 @
> 2.60GHz, running Ubuntu 18.04. Test kernels are compiled at
> 57a339117e52 ("selftests/seccomp: Compare bitmap vs filter overhead")
> and 3650b228f83a ("Linux 5.10-rc1"), built with the config of Ubuntu's
> 5.3.0-64-generic kernel, then `make olddefconfig`. "Mitigations off"
> indicates the kernel was booted with "nospectre_v2 nospectre_v1
> no_stf_barrier tsx=off tsx_async_abort=off".
>
> The benchmark was a single-job make of the x86_64 defconfig of 5.9.1,
> with CPU affinity set to processor #0 only. Raw results are appended
> below. Each boot is tested by running the build directly and inside
> docker, with and without seccomp. The commands used are attached
> below. Each test is 4 trials, with the wall clock times of the middle
> two (non-minimum, non-maximum) averaged. Results summary:
>
>                      Mitigations On              Mitigations Off
>                 With Cache  Without Cache   With Cache  Without Cache
> Native            18:17.38       18:13.78     18:16.08       18:15.67
> D. no seccomp     18:15.54       18:17.71     18:17.58       18:16.75
> D. + seccomp      20:42.47       20:45.04     18:47.67       18:49.01
>
> To be honest, I'm somewhat surprised that it didn't produce as much of
> a dent in the seccomp overhead in this macro benchmark as I had
> expected.

My peers pointed out that a few mitigations were still left on in my
previous benchmark, and suggested booting with "noibrs noibpb nopti
nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable
no_stf_barrier mds=off tsx=on tsx_async_abort=off mitigations=off"
instead. Results, with the "Mitigations Off" columns updated:

                     Mitigations On              Mitigations Off
                With Cache  Without Cache   With Cache  Without Cache
Native            18:17.38       18:13.78     17:43.42       17:47.68
D. no seccomp     18:15.54       18:17.71     17:34.59       17:37.54
D. + seccomp      20:42.47       20:45.04     17:35.70       17:37.16

With mitigations off, whether seccomp is on or off does not seem to
make much of a difference for this benchmark. Enabling the bitmap does
seem to decrease the overall compilation time, but it does so in the
runs where seccomp is off as well, so the speedup is probably from
other factors. We are thinking about using more syscall-intensive
workloads, such as httpd.

Though, this does make me wonder: where does the 3-minute overhead
with seccomp and mitigations on come from? Is it data cache misses? If
that is the case, can we somehow preload the seccomp bitmap cache
maybe? I mean, mitigations alone cause only around half a minute of
slowdown without seccomp, but seccomp somehow amplifies the slowdown
by an additional 2.5 minutes, so something must be off here.
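
A minimal sketch of the arithmetic behind that estimate, using the two
Docker rows ("With Cache" columns) of the table above. The helper names
are made up for illustration and are not part of the benchmark scripts:

```python
def to_seconds(stamp):
    """Convert an 'MM:SS.ss' wall-clock string to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def mitigation_cost(row):
    """Slowdown (in seconds) of mitigations on versus mitigations off."""
    return to_seconds(row["on"]) - to_seconds(row["off"])


# Values copied from the "With Cache" columns of the summary table.
docker_no_seccomp = {"on": "18:15.54", "off": "17:34.59"}
docker_seccomp = {"on": "20:42.47", "off": "17:35.70"}

base = mitigation_cost(docker_no_seccomp)   # mitigations alone
total = mitigation_cost(docker_seccomp)     # mitigations plus seccomp
extra = total - base                        # part attributable to seccomp

print(f"mitigations, no seccomp: {base:7.2f} s")
print(f"mitigations + seccomp:   {total:7.2f} s")
print(f"additional with seccomp: {extra:7.2f} s ({extra / 60:.2f} min)")
```

The same subtraction on the "Without Cache" columns gives a similar
picture, which is why the cache does not appear to explain the gap.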

This is the raw output for the time commands:

==== with cache, mitigations off ====

947.02user 108.62system 17:47.65elapsed 98%CPU (0avgtext+0avgdata
239804maxresident)k
25112inputs+217152outputs (166major+51934447minor)pagefaults 0swaps

947.91user 108.20system 17:46.53elapsed 99%CPU (0avgtext+0avgdata
239576maxresident)k
0inputs+217152outputs (0major+51941524minor)pagefaults 0swaps

948.33user 108.70system 17:47.72elapsed 98%CPU (0avgtext+0avgdata
239604maxresident)k
0inputs+217152outputs (0major+51938566minor)pagefaults 0swaps

948.65user 108.81system 17:48.41elapsed 98%CPU (0avgtext+0avgdata
239692maxresident)k
0inputs+217152outputs (0major+51935349minor)pagefaults 0swaps


932.12user 113.68system 17:37.24elapsed 98%CPU (0avgtext+0avgdata
239660maxresident)k
0inputs+217152outputs (0major+51547571minor)pagefaults 0swaps

931.69user 114.12system 17:37.84elapsed 98%CPU (0avgtext+0avgdata
239448maxresident)k
0inputs+217152outputs (0major+51539964minor)pagefaults 0swaps

932.25user 113.39system 17:37.75elapsed 98%CPU (0avgtext+0avgdata
239372maxresident)k
0inputs+217152outputs (0major+51538018minor)pagefaults 0swaps

931.09user 114.25system 17:37.34elapsed 98%CPU (0avgtext+0avgdata
239508maxresident)k
0inputs+217152outputs (0major+51537700minor)pagefaults 0swaps


929.96user 113.42system 17:36.23elapsed 98%CPU (0avgtext+0avgdata
239448maxresident)k
984inputs+217152outputs (22major+51544059minor)pagefaults 0swaps

929.73user 115.13system 17:38.09elapsed 98%CPU (0avgtext+0avgdata
239464maxresident)k
0inputs+217152outputs (0major+51540259minor)pagefaults 0swaps

930.13user 112.71system 17:36.17elapsed 98%CPU (0avgtext+0avgdata
239620maxresident)k
0inputs+217152outputs (0major+51540623minor)pagefaults 0swaps

930.57user 113.02system 17:49.70elapsed 97%CPU (0avgtext+0avgdata
239432maxresident)k
0inputs+217152outputs (0major+51537776minor)pagefaults 0swaps

==== without cache, mitigations off ====

947.59user 108.06system 17:44.56elapsed 99%CPU (0avgtext+0avgdata
239484maxresident)k
25112inputs+217152outputs (167major+51938723minor)pagefaults 0swaps

947.95user 108.58system 17:43.40elapsed 99%CPU (0avgtext+0avgdata
239580maxresident)k
0inputs+217152outputs (0major+51943434minor)pagefaults 0swaps

948.54user 106.62system 17:42.39elapsed 99%CPU (0avgtext+0avgdata
239608maxresident)k
0inputs+217152outputs (0major+51936408minor)pagefaults 0swaps

947.85user 107.92system 17:43.44elapsed 99%CPU (0avgtext+0avgdata
239656maxresident)k
0inputs+217152outputs (0major+51931633minor)pagefaults 0swaps


931.28user 111.16system 17:33.59elapsed 98%CPU (0avgtext+0avgdata
239440maxresident)k
0inputs+217152outputs (0major+51543540minor)pagefaults 0swaps

930.21user 112.56system 17:34.20elapsed 98%CPU (0avgtext+0avgdata
239400maxresident)k
0inputs+217152outputs (0major+51539699minor)pagefaults 0swaps

930.16user 113.74system 17:35.06elapsed 98%CPU (0avgtext+0avgdata
239344maxresident)k
0inputs+217152outputs (0major+51543072minor)pagefaults 0swaps

930.17user 112.77system 17:34.98elapsed 98%CPU (0avgtext+0avgdata
239176maxresident)k
0inputs+217152outputs (0major+51540777minor)pagefaults 0swaps


931.92user 113.31system 17:36.05elapsed 98%CPU (0avgtext+0avgdata
239520maxresident)k
984inputs+217152outputs (22major+51534636minor)pagefaults 0swaps

931.14user 112.81system 17:35.35elapsed 98%CPU (0avgtext+0avgdata
239524maxresident)k
0inputs+217152outputs (0major+51549007minor)pagefaults 0swaps

930.93user 114.56system 17:37.72elapsed 98%CPU (0avgtext+0avgdata
239360maxresident)k
0inputs+217152outputs (0major+51542191minor)pagefaults 0swaps

932.26user 111.54system 17:35.36elapsed 98%CPU (0avgtext+0avgdata
239572maxresident)k
0inputs+217152outputs (0major+51537921minor)pagefaults 0swaps
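
For anyone who wants to recompute the summary from the raw output, here
is a rough sketch in Python. It assumes the default GNU time output
format shown above ("[h:]mm:ss.ss" immediately followed by "elapsed")
and groups of exactly four trials; the function names are illustrative
only. The sample is the first group of the first section:

```python
import re

# Matches "[h:]mm:ss.ss" immediately followed by "elapsed".
ELAPSED = re.compile(r"(?:(\d+):)?(\d+):(\d+(?:\.\d+)?)elapsed")


def elapsed_seconds(text):
    """Return every 'elapsed' wall-clock time found in text, in seconds."""
    times = []
    for hours, minutes, seconds in ELAPSED.findall(text):
        times.append(int(hours or 0) * 3600 + int(minutes) * 60 + float(seconds))
    return times


def middle_two_mean(values):
    """Drop the minimum and maximum of four trials, average the other two."""
    if len(values) != 4:
        raise ValueError("expected exactly four trials")
    ordered = sorted(values)
    return (ordered[1] + ordered[2]) / 2


sample = """\
947.02user 108.62system 17:47.65elapsed 98%CPU (0avgtext+0avgdata
239804maxresident)k
947.91user 108.20system 17:46.53elapsed 99%CPU (0avgtext+0avgdata
239576maxresident)k
948.33user 108.70system 17:47.72elapsed 98%CPU (0avgtext+0avgdata
239604maxresident)k
948.65user 108.81system 17:48.41elapsed 98%CPU (0avgtext+0avgdata
239692maxresident)k
"""

mean = middle_two_mean(elapsed_seconds(sample))
minutes, seconds = divmod(mean, 60)
print(f"{int(minutes)}:{seconds:05.2f}")
```

The regular expression only looks at the "elapsed" field, so it does not
matter that the mail client wrapped each record across two lines.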

YiFei Zhu