Re: [PATCH] kernel: Expose SYS_kcmp by default

From: Michel Dänzer
Date: Mon Feb 08 2021 - 08:12:53 EST


On 2021-02-08 12:49 p.m., Michel Dänzer wrote:
On 2021-02-05 9:53 p.m., Daniel Vetter wrote:
On Fri, Feb 5, 2021 at 7:37 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:

On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
Cc: Will Drewry <wad@xxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Dave Airlie <airlied@xxxxxxxxx>
Cc: Daniel Vetter <daniel@xxxxxxxx>
Cc: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
---
  init/Kconfig                                  | 11 +++++++++++
  kernel/Makefile                               |  2 +-
  tools/testing/selftests/seccomp/seccomp_bpf.c |  2 +-
  3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index b77c60f8b963..f62fca13ac5b 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1194,6 +1194,7 @@ endif # NAMESPACES
  config CHECKPOINT_RESTORE
       bool "Checkpoint/restore support"
       select PROC_CHILDREN
+     select KCMP
       default n
       help
         Enables additional kernel features in a sake of checkpoint/restore.
@@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
  config ARCH_HAS_MEMBARRIER_SYNC_CORE
       bool

+config KCMP
+     bool "Enable kcmp() system call" if EXPERT
+     default y

I would expect this to be not default-y, especially if
CHECKPOINT_RESTORE does a "select" on it.

This is a really powerful syscall, but it is bounded by ptrace access
controls, and uses pointer address obfuscation, so it may be okay to
expose this. As it is, at least Ubuntu already has
CONFIG_CHECKPOINT_RESTORE, so really, there's probably not much
difference on exposure.

So, if you drop the "default y", I'm fine with this.

It was maybe stupid, but our userspace started relying on fd
comaprison through sys_kcomp. So for better or worse, if you want to
run the mesa3d gl/vk stacks, you need this.

That's overstating things somewhat. The vast majority of applications will work fine regardless (as they did before Mesa started using this functionality). Only some special ones will run into issues, because the user-space drivers incorrectly assume two file descriptors reference different descriptions.


Was maybe not the brighest ideas, but since enough distros had this
enabled by defaults,

Right, that (and the above) is why I considered it fair game to use. What should I have done instead? (TBH I was surprised that this functionality isn't generally available)

In that spirit, an alternative might be to make KCMP_FILE available unconditionally, and the rest of SYS_kcmp only with CHECKPOINT_RESTORE as before. (Or maybe other parts of SYS_kcmp are generally useful as well?)


--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer