Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

From: Linus Torvalds
Date: Fri Feb 22 2019 - 12:43:39 EST

On Fri, Feb 22, 2019 at 12:35 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> Or, can we do this?
> long __probe_user_read(void *dst, const void *src, size_t size)
> {

Add a

if (!access_ok(src, size))
ret = -EFAULT;
else {
.. do the pagefault_disable() etc ..

to after the "set_fs()", and it looks good to me. Make it clear that
yes, this works _only_ for user reads.

Adn that makes all the games with "kernel_uaccess_faults_ok"
pointless, so you can just remove them.

(note that the "access_ok()" has to come after we've done "set_fs()",
because it takes the address limit from that).

Also, since normally we'd expect that we already have USER_DS, it
might be worthwhile to do this with a wrapper, something along the
lines of

mm_segment_t old_fs = get_fs();

if (segment_eq(old_fs, USER_DS))
return __normal_probe_user_read();
ret = __normal_probe_user_read();
return ret;

and have that __normal_probe_user_read() just do

if (!access_ok(src, size))
return -EFAULT;
ret = __copy_from_user_inatomic(dst, ...);
return ret ? -EFAULT : 0;

which looks more obvious.

Also, I would suggest that you just make the argument type be "const
void __user *", since the whole point is that this takes a user
pointer, and nothing else.

Then we should still probably fix up "__probe_kernel_read()" to not
allow user accesses. The easiest way to do that is actually likely to
use the "unsafe_get_user()" functions *without* doing a
uaccess_begin(), which will mean that modern CPU's will simply fault
on a kernel access to user space.

The nice thing about that is that usually developers will have access
to exactly those modern boxes, so the people who notice that it
doesn't work are the right people.

Alternatively, we should just make it be architecture-specific, so
that architectures can decide "this address cannot be a kernel
address" and refuse to do it.