Re: [PATCH] kernel: introduce prctl(PR_LOG_UACCESS)

From: Kees Cook
Date: Wed Sep 22 2021 - 11:30:50 EST


On Wed, Sep 22, 2021 at 09:23:10AM -0500, Eric W. Biederman wrote:
> Peter Collingbourne <pcc@xxxxxxxxxx> writes:
>
> > This patch introduces a kernel feature known as uaccess logging.
> > With uaccess logging, the userspace program passes the address and size
> > of a so-called uaccess buffer to the kernel via a prctl(). The prctl()
> > is a request for the kernel to log any uaccesses made during the next
> > syscall to the uaccess buffer. When the next syscall returns, the address
> > one past the end of the logged uaccess buffer entries is written to the
> > location specified by the third argument to the prctl(). In this way,
> > the userspace program may enumerate the uaccesses logged to the access
> > buffer to determine which accesses occurred.
> > [...]
> > 3) Kernel fuzzing. We may use the list of reported kernel accesses to
> > guide a kernel fuzzing tool such as syzkaller (so that it knows which
> > parts of user memory to fuzz), as an alternative to providing the tool
> > with a list of syscalls and their uaccesses (which again thanks to
> > (2) may not be accurate).
>
> How is logging the kernel's activity like this not a significant
> information leak? How is this safe for unprivileged users?

This does result in userspace being able to "watch" the kernel progress
through a syscall. I'd say it's less dangerous than userfaultfd, but
still worrisome. (And userfaultfd is normally disabled[1] for unprivileged
users trying to interpose on the kernel's accesses to user memory.)

Regardless, this is a pretty useful tool for this kind of fuzzing.
Perhaps the timing exposure could be mitigated by having the kernel
collect the record in a separate kernel-allocated buffer and flush the
results to userspace at syscall exit? (This would solve the
copy_to_user() recursion issue too.)
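
To make the deferred-flush idea concrete, here is a rough userspace model
of it. This is only a sketch: struct pending_log, pending_log_record(),
pending_log_flush() and MAX_PENDING are names made up for illustration,
not anything from the patch, and memcpy() stands in for the single
copy_to_user() at syscall exit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 32	/* arbitrary capacity chosen for this sketch */

struct uaccess_entry {
	const void *addr;
	size_t size;
};

/* Models a kernel-allocated, per-task staging area (hypothetical). */
struct pending_log {
	struct uaccess_entry entries[MAX_PENDING];
	size_t count;
};

/* Called for each uaccess during the syscall; touches no user memory. */
static void pending_log_record(struct pending_log *log,
			       const void *addr, size_t size)
{
	if (log->count < MAX_PENDING) {
		log->entries[log->count].addr = addr;
		log->entries[log->count].size = size;
		log->count++;
	}
}

/*
 * Called once at syscall exit: a single copy out to the user-visible
 * buffer. Returns the number of entries written and resets the log.
 */
static size_t pending_log_flush(struct pending_log *log,
				struct uaccess_entry *user_buf,
				size_t user_cap)
{
	size_t n = log->count < user_cap ? log->count : user_cap;

	memcpy(user_buf, log->entries, n * sizeof(*user_buf));
	log->count = 0;
	return n;
}
```

In this model the user-visible buffer does not change until the flush, so
another thread polling it could not watch the syscall make progress, and
the recording path never performs a uaccess of its own.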

I'm pondering what else might be getting exposed by creating this level
of probing... kernel addresses would already be getting rejected, so
they wouldn't show up in the buffer. Hmm. Jann, any thoughts here?


Some other thoughts:


Instead of reimplementing copy_*_user() with a new wrapper that
bypasses some checks and adds others and has to stay in sync, etc,
how about just adding a "recursion" flag? Something like:

	copy_from_user(...)
		instrument_copy_from_user(...)
			uaccess_buffer_log_read(...)
				if (current->uaccess_buffer.writing)
					return;
				uaccess_buffer_log(...)
					current->uaccess_buffer.writing = true;
					copy_to_user(...)
					current->uaccess_buffer.writing = false;
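
For anyone who wants to poke at the flag idea outside the kernel, a
minimal userspace model might look like the following. Everything here
is an assumption for illustration: task_buf stands in for
current->uaccess_buffer, the model_copy_*_user() helpers stand in for the
real copy_*_user() paths, and the entry layout is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16	/* arbitrary capacity chosen for this sketch */

/* Hypothetical log entry layout, not the one from the patch. */
struct uaccess_entry {
	const void *addr;
	size_t size;
	bool is_write;
};

struct uaccess_buffer {
	struct uaccess_entry entries[MAX_ENTRIES];
	size_t count;
	bool writing;	/* set while the logger itself does a uaccess */
};

static struct uaccess_buffer task_buf;	/* models current->uaccess_buffer */

static void model_copy_to_user(void *dst, const void *src, size_t n);

static void uaccess_buffer_log(const void *addr, size_t size, bool is_write)
{
	struct uaccess_entry e = { addr, size, is_write };

	if (task_buf.writing)	/* the logger's own write: do not log it */
		return;
	if (task_buf.count >= MAX_ENTRIES)
		return;

	task_buf.writing = true;
	/* This goes through the instrumented path, but the flag stops recursion. */
	model_copy_to_user(&task_buf.entries[task_buf.count], &e, sizeof(e));
	task_buf.writing = false;
	task_buf.count++;
}

static void model_copy_from_user(void *dst, const void *src, size_t n)
{
	uaccess_buffer_log(src, n, false);
	memcpy(dst, src, n);
}

static void model_copy_to_user(void *dst, const void *src, size_t n)
{
	uaccess_buffer_log(dst, n, true);
	memcpy(dst, src, n);
}
```

The point of the model is that the logger reuses the ordinary
copy_to_user() path, so there is no second implementation to keep in
sync; the flag is the only thing that keeps the log write from logging
itself.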


How about using this via seccomp instead of a per-syscall prctl? This
would mean you would have very specific control over which syscalls
should get the uaccess tracing, and wouldn't need to deal with
the signal mask (I think). I would imagine something similar to
SECCOMP_FILTER_FLAG_LOG, maybe SECCOMP_FILTER_FLAG_UACCESS_TRACE, plus
a new top-level seccomp command (like SECCOMP_GET_NOTIF_SIZES),
maybe named SECCOMP_SET_UACCESS_TRACE_BUFFER.

This would likely only make sense for SECCOMP_RET_TRACE or _TRAP if the
program wants to collect the results after every syscall. And maybe this
won't make any sense across exec (losing the mm that was used during
SECCOMP_SET_UACCESS_TRACE_BUFFER). Hmmm.


-Kees

[1] https://git.kernel.org/linus/d0d4730ac2e404a5b0da9a87ef38c73e51cb1664

--
Kees Cook