Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

From: Andy Lutomirski
Date: Mon Feb 18 2019 - 14:16:03 EST


On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> > Another would be to have the buffer passed to flush_buffer() (i.e.
> > the callback of decompress_fn) allocated with 4 bytes of padding
> > past the part where the unpacked piece of data is placed for the
> > callback to find. As in,
> >
> > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > index 63b4b7eee138..ca3f7ecc9b35 100644
> > --- a/lib/decompress_inflate.c
> > +++ b/lib/decompress_inflate.c
> > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> > rc = -1;
> > if (flush) {
> > out_len = 0x8000; /* 32 K */
> > - out_buf = malloc(out_len);
> > + out_buf = malloc(out_len + 4);
>
> +8 actually.
>
> > } else {
> > if (!out_len)
> > out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> >
> > for gunzip/decompress and similar ones for bzip2, etc. The contents
> > layout doesn't have anything to do with that...
>
> Right. That works nicely.
>

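To spell out what that padding absorbs: a word-at-a-time scan loads
whole longs, so the load that finds the terminating NUL may already
have touched up to sizeof(long)-1 bytes past it -- hence a full word
of slack. A minimal illustration (hypothetical helper, little-endian
only, not the exact kernel loop):

	/*
	 * word_strlen: find the NUL a long at a time. If the NUL is
	 * the last byte of the buffer, the final load has already read
	 * up to sizeof(long)-1 bytes past the end of the buffer, which
	 * is exactly what the extra out_buf padding makes harmless.
	 */
	static size_t word_strlen(const char *s)
	{
		/* assume s is sizeof(long)-aligned for simplicity */
		const unsigned long *p = (const unsigned long *)s;
		size_t len = 0;
		unsigned int i;

		for (;;) {
			unsigned long v = *p++;	/* may overread past the NUL */

			for (i = 0; i < sizeof(v); i++, len++)
				if (!((v >> (8 * i)) & 0xff))
					return len;
		}
	}
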
This seems like it's just papering over the underlying problem: with
Jann's new checks in place, strncpy_from_user() is simply buggy. Does
the patch below look decent? It's only compile-tested, but it's
conceptually straightforward. I was hoping I could get rid of the
check-maximum-address stuff, but it's needed for architectures where
the user range is adjacent to the kernel range (i.e. not x86_64).
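
For reference, the check-maximum-address stuff is the clamp on the
caller side of do_strncpy_from_user(); schematically, eliding the
access_ok() and KASAN details:

	unsigned long max_addr = user_addr_max();
	unsigned long src_addr = (unsigned long)src;

	if (likely(src_addr < max_addr)) {
		/*
		 * Cap the scan at the last user address so a string with
		 * no NUL can't walk the word loop into kernel addresses
		 * on architectures where the two ranges are adjacent.
		 */
		return do_strncpy_from_user(dst, src, count, max_addr - src_addr);
	}
	return -EFAULT;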

Jann, I'm still unhappy that this code will write up to sizeof(long)-1
user-controlled garbage bytes in-bounds past the null-terminator in
the kernel buffer. Do you think that's worth changing, too? I don't
think it's a bug per se, but it seems like a nifty little wart for an
attacker to try to abuse.
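
If we do decide to change it, the fix looks cheap: move the store
after the zero check and mask off everything past the terminator.
Untested sketch, using zero_bytemask() from <asm/word-at-a-time.h>
(c, data, constants, dst, and res as in the loop below):

	if (has_zero(c, &data, &constants)) {
		data = prep_zero_mask(c, data, &constants);
		data = create_zero_mask(data);
		/* store only the bytes up to and including the NUL */
		*(unsigned long *)(dst+res) = c & zero_bytemask(data);
		return res + find_zero(data);
	}
	*(unsigned long *)(dst+res) = c;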

On brief inspection, strnlen_user() does not have an equivalent bug.

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..709d6efe0d42 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -10,12 +10,7 @@
#include <asm/byteorder.h>
#include <asm/word-at-a-time.h>

-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-#define IS_UNALIGNED(src, dst) 0
-#else
-#define IS_UNALIGNED(src, dst) \
- (((long) dst | (long) src) & (sizeof(long) - 1))
-#endif
+#define IS_UNALIGNED(addr) (((long)(addr)) & (sizeof(long) - 1))

/*
* Do a strncpy, return length of string without final '\0'.
@@ -35,14 +30,39 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
if (max > count)
max = count;

- if (IS_UNALIGNED(src, dst))
+ /*
+ * First handle any unaligned prefix of src.
+ */
+ while (max && IS_UNALIGNED(src+res)) {
+ char c;
+
+ unsafe_get_user(c, src+res, efault);
+ dst[res] = c;
+ if (!c)
+ return res;
+ res++;
+ max--;
+ }
+
+ /*
+ * Now we know that src + res is aligned. If dst is unaligned and
+ * we don't have efficient unaligned access, then keep going one
+ * byte at a time. (This could be optimized, but it would make
+ * the code more complicated.)
+ */
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (IS_UNALIGNED(dst + res))
goto byte_at_a_time;
+#endif

while (max >= sizeof(unsigned long)) {
+ /*
+ * src + res is aligned, so the reads in this loop will
+ * not cross a page boundary.
+ */
unsigned long c, data;

- /* Fall back to byte-at-a-time if we get a page fault */
- unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
+ unsafe_get_user(c, (unsigned long __user *)(src+res), efault);

*(unsigned long *)(dst+res) = c;
if (has_zero(c, &data, &constants)) {
@@ -54,7 +74,10 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
max -= sizeof(unsigned long);
}

byte_at_a_time:
+ /*
+ * Finish the job one byte at a time.
+ */
while (max) {
char c;