Re: Does CONFIG_HARDENED_USERCOPY break /dev/mem?

From: Kees Cook
Date: Wed Nov 22 2017 - 12:43:29 EST


On Wed, Nov 22, 2017 at 1:28 AM, Michael Holzheu
<holzheu@xxxxxxxxxxxxxxxxxx> wrote:
> Am Mon, 13 Nov 2017 11:19:38 +0100
> schrieb Michael Holzheu <holzheu@xxxxxxxxxxxxxxxxxx>:
>
>> Am Fri, 10 Nov 2017 10:46:49 -0800
>> schrieb Kees Cook <keescook@xxxxxxxxxxxx>:
>>
>> > On Fri, Nov 10, 2017 at 7:45 AM, Michael Holzheu
>> > <holzheu@xxxxxxxxxxxxxxxxxx> wrote:
>> > > Hello Kees,
>> > >
>> > > When I try to run the crash tool on my s390 live system I get a kernel panic
>> > > when reading memory within the kernel image:
>> > >
>> > > # uname -a
>> > > Linux r3545011 4.14.0-rc8-00066-g1c9dbd4615fd #45 SMP PREEMPT Fri Nov 10 16:16:22 CET 2017 s390x s390x s390x GNU/Linux
>> > > # crash /boot/vmlinux-devel /dev/mem
>> > > # crash> rd 0x100000
>> > >
>> > > usercopy: kernel memory exposure attempt detected from 0000000000100000 (<kernel text>) (8 bytes)
>> > > ------------[ cut here ]------------
>> > > kernel BUG at mm/usercopy.c:72!
>> > > illegal operation: 0001 ilc:1 [#1] PREEMPT SMP.
>> > > Modules linked in:
>> > > CPU: 0 PID: 1461 Comm: crash Not tainted 4.14.0-rc8-00066-g1c9dbd4615fd-dirty #46
>> > > Hardware name: IBM 2827 H66 706 (z/VM 6.3.0)
>> > > task: 000000001ad10100 task.stack: 000000001df78000
>> > > Krnl PSW : 0704d00180000000 000000000038165c (__check_object_size+0x164/0x1d0)
>> > > R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
>> > > Krnl GPRS: 0000000012440e1d 0000000080000000 0000000000000061 00000000001cabc0
>> > > 00000000001cc6d6 0000000000000000 0000000000cc4ed2 0000000000001000
>> > > 000003ffc22fdd20 0000000000000008 0000000000100008 0000000000000001
>> > > 0000000000000008 0000000000100000 0000000000381658 000000001df7bc90
>> > > Krnl Code: 000000000038164c: c020004a1c4a larl %r2,cc4ee0
>> > > 0000000000381652: c0e5fff2581b brasl %r14,1cc688
>> > > #0000000000381658: a7f40001 brc 15,38165a
>> > > >000000000038165c: eb42000c000c srlg %r4,%r2,12
>> > > 0000000000381662: eb32001c000c srlg %r3,%r2,28
>> > > 0000000000381668: c0110003ffff lgfi %r1,262143
>> > > 000000000038166e: ec31ff752065 clgrj %r3,%r1,2,381558
>> > > 0000000000381674: a7f4ff67 brc 15,381542
>> > > Call Trace:
>> > > ([<0000000000381658>] __check_object_size+0x160/0x1d0)
>> > > [<000000000082263a>] read_mem+0xaa/0x130.
>> > > [<0000000000386182>] __vfs_read+0x42/0x168.
>> > > [<000000000038632e>] vfs_read+0x86/0x140.
>> > > [<0000000000386a26>] SyS_read+0x66/0xc0.
>> > > [<0000000000ace6a4>] system_call+0xc4/0x2b0.
>> > > INFO: lockdep is turned off.
>> > > Last Breaking-Event-Address:
>> > > [<0000000000381658>] __check_object_size+0x160/0x1d0
>> > >
>> > > Kernel panic - not syncing: Fatal exception: panic_on_oops
>> > >
>> > > With CONFIG_HARDENED_USERCOPY copy_to_user() checks in __check_object_size()
>> > > if the source address is within the kernel image:
>> > >
>> > > - __check_object_size() -> check_kernel_text_object():
>> > >
>> > > /* Is this address range in the kernel text area? */
>> > > static inline const char *check_kernel_text_object(const void *ptr,
>> > > unsigned long n)
>> > > {
>> > > unsigned long textlow = (unsigned long)_stext;
>> > > unsigned long texthigh = (unsigned long)_etext;
>> > > unsigned long textlow_linear, texthigh_linear;
>> > >
>> > > if (overlaps(ptr, n, textlow, texthigh))
>> > > return "<kernel text>";
>> > >
>> > > When the crash tool reads from 0x100000, this check leads to the kernel BUG()
>> > > in drivers/char/mem.c:
>> > >
>> > > 144 } else {
>> > > 145 /*
>> > > 146 * On ia64 if a page has been mapped somewhere as
>> > > 147 * uncached, then it must also be accessed uncached
>> > > 148 * by the kernel or data corruption may occur.
>> > > 149 */
>> > > 150 ptr = xlate_dev_mem_ptr(p);
>> > > 151 if (!ptr)
>> > > 152 return -EFAULT;
>> > > 153
>> > > 154 remaining = copy_to_user(buf, ptr, sz); <<<---- BUG
>> > > 155
>> > > 156 unxlate_dev_mem_ptr(p, ptr);
>> > > 157 }
>> > >
>> > > Here the reporting function in mm/usercopy.c:
>> > >
>> > > 61 static void report_usercopy(const void *ptr, unsigned long len,
>> > > 62 bool to_user, const char *type)
>> > > 63 {
>> > > 64 pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
>> > > 65 to_user ? "exposure" : "overwrite",
>> > > 66 to_user ? "from" : "to", ptr, type ? : "unknown", len);
>> > > 67 /*
>> > > 68 * For greater effect, it would be nice to do do_group_exit(),
>> > > 69 * but BUG() actually hooks all the lock-breaking and per-arch
>> > > 70 * Oops code, so that is used here instead.
>> > > 71 */
>> > > 72 BUG();
>> > > 73 }
>> > >
>> > > Shouldn't we skip the kernel address check for /dev/mem - at least when
>> > > CONFIG_STRICT_DEVMEM is not enabled?
>> >
>> > Some kind of better interaction is needed here, I agree. The prior
>> > discussions around this basically resulted in declaring that
>> > HARDENED_USERCOPY without STRICT_DEVMEM didn't make a lot of sense.
>> > i.e. HARDENED_USERCOPY should maybe require STRICT_DEVMEM, etc. Tycho
>> > wrote this up after some back-and-forth:
>> >
>> > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=kspp/kconfig&id=ae98b44ceb338ae165a7f18f29f6244893712da3
>> >
>> > In the end, I was still uncomfortable with it because the usercopy
>> > protection is wider than just the kernel text, so it seemed strange to
>> > force it off when not using STRICT_DEVMEM.
>> >
>> > What's the use-case here where you've got hardened usercopy without
>> > strict devmem?
>>
>> We use that configuration for development and test. We disabled STRICT_DEVMEM
>> for debugging the live system with crash. We enabled HARDENED_USERCOPY for
>> finding user-copy bugs.
>
> So what's your plan now? How will you fix this issue?

I think the best plan here would be to use the Kconfig "imply
STRICT_DEVMEM" in HARDENED_USERCOPY. That would make STRICT_DEVMEM
enabled by default but still configurable. Then the kernel-text check
in hardened usercopy could be skip when !STRICT_DEVMEM.

My primary concern is with failing closed. If someone is only paying
attention to choosing HARDENED_USERCOPY, it should not be possible to
read out kernel memory unless they specifically try to unset something
else (in this case, STRICT_DEVMEM).

How does that sound?

-Kees

--
Kees Cook
Pixel Security