Re: [PATCH RFC] usercopy: optimize stack check flow when the page-spanning test is disabled

From: Kees Cook
Date: Tue Aug 14 2018 - 14:54:25 EST

(Please use contextual quoting in replies... mixing contextual with
top-posting becomes very hard to read...)

On Tue, Aug 14, 2018 at 6:02 AM, Yuanxiaofeng (XiAn)
<yuanxiaofeng1@xxxxxxxxxx> wrote:
> On Tue, Aug 14, 2018 at 8:35PM Matthew Wilcox wrote:
>> On Tue, Aug 14, 2018 at 08:17:31PM +0800, Xiaofeng Yuan wrote:
>>> The check_heap_object() checks the spanning multiple pages and slab.
>>> When the page-spanning test is disabled, the check_heap_object() is
>>> redundant for spanning multiple pages. However, the kernel stacks are
>>> multiple pages under certain conditions: CONFIG_ARCH_THREAD_STACK_ALLOCATOR
>>> is not defined and (THREAD_SIZE >= PAGE_SIZE). At this point, We can skip
>>> the check_heap_object() for kernel stacks to improve performance.
>>> Similarly, the virtually-mapped stack can skip check_heap_object() also,
>>> beacause virt_addr_valid() will return.
>> Why not just check_stack_object() first, then check_heap_object() second?

Most of the dynamically-sized copies (i.e. those that will trigger
__check_object_size being used at all) come out of heap. Stack copies
tend to be a fixed size. That said, the stack check is pretty cheap:
if it's not bounded by task_stack_page(current) ... +THREAD_SIZE, it
kicks out immediately. The frame-walking will only happen if it IS
actually stack (and once finished will short-circuit all remaining

> 1, When the THREAD_SIZE is less than PAGE_SIZE, the stack will allocate memory by kmem_cache_alloc_node(), it's slab memory and will execute __check_heap_object().

Correct, though if an architecture supports stack frame analysis, this
is a more narrow check than the bulk heap object check. (i.e. it may
have sub-object granularity to determine if a copy spans a stack
frame.) This supports the idea of just doing the stack check first,

> 2, When CONFIG_HARDENED_USERCOPY_PAGESPAN is enabled, the multiple-pages stacks will do some check in check_page_span().

PAGESPAN checking is buggy for a lot of reasons, unfortunately. It
should generally stay disabled unless someone is working on getting
rid of allocations that _should_ have marked themselves as spanning
pages. It's unclear if this is even a solvable problem in the kernel
right now due to how networking code manages skbs.

> So, I set some restrictions to make sure the useful check will not be skipped.

It'd be nice to find some workloads that visibly change by making the
heap/stack order change. I think the known worst-case (small-packet
UDP flooding) wouldn't get worse since both checks will be performed
in either case.

(Maybe we should also short-circuit early in heap checks if it IS a
valid heap object: no reason to go do the kernel text check after


Kees Cook
Pixel Security