Re: problem starting /sbin/init (32-bit 5.3-rc8)

From: Kees Cook
Date: Thu Sep 12 2019 - 21:51:23 EST


On Thu, Sep 12, 2019 at 05:16:02PM -0700, Kees Cook wrote:
> On Thu, Sep 12, 2019 at 02:40:19PM -0700, Randy Dunlap wrote:
> > This is 32-bit kernel, just happens to be running on a 64-bit laptop.
> > I added the debug printk in __phys_addr() just before "[cut here]".
> >
> > CONFIG_HARDENED_USERCOPY=y
>
> I can reproduce this under CONFIG_DEBUG_VIRTUAL=y, and it goes back
> to at least to v5.2. Booting with "hardened_usercopy=off" or without
> CONFIG_DEBUG_VIRTUAL makes this go away (since __phys_addr() doesn't
> get called):
>
> __check_object_size+0xff/0x1b0:
> pfn_to_section_nr at include/linux/mmzone.h:1153
> (inlined by) __pfn_to_section at include/linux/mmzone.h:1291
> (inlined by) virt_to_head_page at include/linux/mm.h:729
> (inlined by) check_heap_object at mm/usercopy.c:230
> (inlined by) __check_object_size at mm/usercopy.c:280
>
> Is virt_to_head_page() illegal to use under some recently new conditions?

This combination appears to be bugged since the original introduction
of hardened usercopy in v4.8. Is this an untested combination until
now? (I don't usually do tests with CONFIG_DEBUG_VIRTUAL, but I guess
I will from now on!)

Note from the future (i.e. the end of this email where I figure it out):
it turns out it's actually these three together:

CONFIG_HIGHMEM=y
CONFIG_DEBUG_VIRTUAL=y
CONFIG_HARDENED_USERCOPY=y

>
> > The BUG is this line in arch/x86/mm/physaddr.c:
> > VIRTUAL_BUG_ON((phys_addr >> PAGE_SHIFT) > max_low_pfn);
> > It's line 83 in my source file only due to adding <linux/printk.h> and
> > a conditional pr_crit() call.

What exactly is this trying to test?

> > [ 19.730409][ T1] debug: unmapping init [mem 0xdc7bc000-0xdca30fff]
> > [ 19.734289][ T1] Write protecting kernel text and read-only data: 13888k
> > [ 19.737675][ T1] rodata_test: all tests were successful
> > [ 19.740757][ T1] Run /sbin/init as init process
> > [ 19.792877][ T1] __phys_addr: max_low_pfn=0x36ffe, x=0xff001ff1, phys_addr=0x3f001ff1

It seems like this address is way out of range of the physical memory.
That seems like it's vmalloc or something, but that was actually
explicitly tested for back in the v4.8 version (it became unneeded
later).

> > [ 19.796561][ T1] ------------[ cut here ]------------
> > [ 19.797501][ T1] kernel BUG at ../arch/x86/mm/physaddr.c:83!
> > [ 19.802799][ T1] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > [ 19.803782][ T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc8 #6
> > [ 19.803782][ T1] Hardware name: Dell Inc. Inspiron 1318 /0C236D, BIOS A04 01/15/2009
> > [ 19.803782][ T1] EIP: __phys_addr+0xaf/0x100
> > [ 19.803782][ T1] Code: 85 c0 74 67 89 f7 c1 ef 0c 39 f8 73 2e 56 53 50 68 90 9f 1f dc 68 00 eb 45 dc e8 ec b3 09 00 83 c4 14 3b 3d 30 55 cf dc 76 11 <0f> 0b b8 7c 3b 5c dc e8 45 53 4c 00 90 8d 74 26 00 89 d8 e8 39 cd
> > [ 19.803782][ T1] EAX: 00000044 EBX: ff001ff1 ECX: 00000000 EDX: db90a471
> > [ 19.803782][ T1] ESI: 3f001ff1 EDI: 0003f001 EBP: f41ddea0 ESP: f41dde90
> > [ 19.803782][ T1] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010216
> > [ 19.803782][ T1] CR0: 80050033 CR2: dc218544 CR3: 1ca39000 CR4: 000406d0
> > [ 19.803782][ T1] Call Trace:
> > [ 19.803782][ T1] __check_object_size+0xaf/0x3c0
> > [ 19.803782][ T1] ? __might_sleep+0x80/0xa0
> > [ 19.803782][ T1] copy_strings+0x1c2/0x370

Oh, this is actually copying into a kmap() pointer due to the weird
stuff exec() does:

kaddr = kmap(kmapped_page);
...
if (copy_from_user(kaddr+offset, str, bytes_to_copy)) {

> > [ 19.803782][ T1] copy_strings_kernel+0x2b/0x40
> >
> > Full boot log or kernel .config file are available if wanted.

Is kmap somewhere "unexpected" in this case? Ah-ha, yes, it seems it is.
There is even a helper to do the "right" thing as virt_to_page(). This
seems to be used very rarely in the kernel... is there a page type for
kmap pages? This seems like a hack, but it fixes it:


diff --git a/mm/usercopy.c b/mm/usercopy.c
index 98e924864554..5a14b80ad63e 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -11,6 +11,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/mm.h>
+#include <linux/highmem.h>
#include <linux/slab.h>
#include <linux/sched.h>
#include <linux/sched/task.h>
@@ -227,7 +228,7 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
if (!virt_addr_valid(ptr))
return;

- page = virt_to_head_page(ptr);
+ page = compound_head(kmap_to_page((void *)ptr));

if (PageSlab(page)) {
/* Check slab allocator for flags and size. */


What's the right way to "ignore" the kmap range? (i.e. it's not Slab, so
ignore it here: I can't find a page type nor a "is this kmap?" helper...)

--
Kees Cook