Re: [PATCH 1/2] seccomp: notify user trap about unused filter
From: Kees Cook
Date: Wed May 27 2020 - 21:59:59 EST
On Thu, May 28, 2020 at 01:16:46AM +0200, Christian Brauner wrote:
> I'm also starting to think this isn't even possible or currently doable
> safely.
> The fdtable in the kernel would end up with a dangling pointer, I would
> think. Unless you backtrack all fds that still have a reference into the
> fdtable and refer to that file and close them all in the kernel which I
> don't think is possible and also sounds very dodgy. This also really
> seems like we would be breaking a major contract, namely that fds stay
> valid until userspace calls close, execve(), or exits.

Right, I think I was just using the wrong words? I was looking at it
like a pipe, or a socket, where you still have an fd, but reads return
0, you might get SIGPIPE, etc. The VFS clearly knows what a
"disconnected" fd is, and I had assumed there was general logic for it
to indicate "I'm not here any more".
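
For reference, the pipe behavior I had in mind looks roughly like this
from userspace. This is only a sketch of ordinary pipe semantics; the
helper names are made up for illustration and none of it is seccomp code:

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Read from a pipe whose only writer has gone away.
 * Returns read()'s result: 0 means end-of-file, -1 means error. */
ssize_t read_after_writer_closed(void)
{
	int fds[2];
	char buf[8];
	ssize_t n;

	if (pipe(fds) < 0)
		return -1;
	close(fds[1]);                      /* the writer disappears */
	n = read(fds[0], buf, sizeof(buf)); /* fd is still valid, read sees EOF */
	close(fds[0]);
	return n;
}

/* Write to a pipe whose only reader has gone away.
 * Returns the errno from write(), or 0 if the write succeeded.
 * Note: this ignores SIGPIPE for the whole process. */
int errno_after_reader_closed(void)
{
	int fds[2];
	int err = 0;

	if (pipe(fds) < 0)
		return errno;
	signal(SIGPIPE, SIG_IGN);           /* get EPIPE instead of being killed */
	close(fds[0]);                      /* the reader disappears */
	if (write(fds[1], "x", 1) < 0)
		err = errno;
	close(fds[1]);
	return err;
}
```

In both cases the surviving fd stays open and usable; the kernel just
reports that the other side is gone (EOF on read, EPIPE on write).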

I recently did something very similar to the pstore filesystem, but I got
to cheat with some massive subsystem locks. In that case I needed to clear
all the inodes out of the tmpfs, so I unlink them all and manage the data
lifetimes pointing back into the (waiting to be unloaded) backend module
by NULLing the pointer back, which is safe because of how the locking
there happens to work. Any open readers, when they close, will have the
last ref count dropped, at which point the record itself is released too.
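
The closest userspace analogy I can think of is an unlinked file that is
still held open. The sketch below is not pstore code, and the function
name and temp path are only illustrative; it just shows that the data
outlives the name and is released when the last open fd is closed:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Unlink a file while it is still open, then read it back through the fd.
 * On success returns the number of bytes read and stores the link count
 * seen after unlink() in *nlink_out. Returns -1 on any failure. */
ssize_t read_after_unlink(long *nlink_out)
{
	char path[] = "/tmp/unlink-demo-XXXXXX";
	const char msg[] = "record";
	char buf[sizeof(msg)];
	struct stat st;
	ssize_t n = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (unlink(path) < 0)               /* name is gone, open fd keeps the data */
		goto out;
	if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg))
		goto out;
	if (fstat(fd, &st) < 0)
		goto out;
	*nlink_out = (long)st.st_nlink;

	n = pread(fd, buf, sizeof(buf), 0); /* still readable after unlink */
	if (n == (ssize_t)sizeof(msg) && memcmp(buf, msg, sizeof(msg)) != 0)
		n = -1;
out:
	close(fd);                          /* last reference: storage is freed here */
	return n;
}
```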

Back to the seccomp subject: should "all tasks died" be distinguishable
from "I can't find that notification" in the ioctl()? (i.e. is ENOENT
sufficient, or does there need to be an EIO or ESRCH there?)
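
To make the question concrete, here is how a supervisor might tell the
two cases apart if they did get separate errno values. This is purely a
sketch: the enum, the function name, and the use of ESRCH for "no task
uses the filter any more" are assumptions for the sake of discussion,
not existing behavior:

```c
#include <errno.h>

enum notif_failure {
	NOTIF_OK,            /* the ioctl succeeded */
	NOTIF_GONE,          /* this one notification no longer exists */
	NOTIF_FILTER_UNUSED, /* hypothetical: every task using the filter died */
	NOTIF_OTHER,         /* any other error */
};

/* Classify the result of a notification ioctl() from its return value
 * and the errno saved right after the call. */
enum notif_failure classify_notif_errno(int ret, int err)
{
	if (ret == 0)
		return NOTIF_OK;

	switch (err) {
	case ENOENT:
		return NOTIF_GONE;
	case ESRCH:          /* assumption: proposed code for "all tasks died" */
		return NOTIF_FILTER_UNUSED;
	default:
		return NOTIF_OTHER;
	}
}
```

If ENOENT had to cover both cases, the first two failure values would
collapse into one and the supervisor could not tell "skip this request"
from "stop listening entirely" by errno alone.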
--
Kees Cook