[RFC] memdup_user() and friends

From: Al Viro
Date: Sat Jan 06 2018 - 21:17:28 EST


After reviewing memdup_user() callers, I've found several places
where it got completely unbounded values passed for size (up to 2Gb),
as well as some bounded by ridiculously high values - e.g.
if (size > 1024 * 128) /* sane value */
return -EINVAL;

container = memdup_user(buf, size);
if (IS_ERR(container))
return PTR_ERR(container);
in sound/core/control.c:replace_user_tlv(). To add insult to injury, that
particular caller uses the result only for memcmp() and copy_to_user(), so
there's no point whatsoever to try and keep the copy physically contiguous.
IOW, it would be just fine with vmalloc()-based analogue of memdup_user().

There is a bunch of places where a kvmalloc-based analogue is
open-coded and they deserve a primitive of their own. We can't convert
memdup_user() itself to that - it would have to be paired with kvfree()
instead of kfree() and we *do* have callers that need kmalloc() - e.g.
when the copy is passed to usb_bulk_msg(), etc. So it needs to be
a separate primitive.

However, memdup_user() itself has another fun issue:
/*
* Always use GFP_KERNEL, since copy_from_user() can sleep and
* cause pagefault, which makes it pointless to use GFP_NOFS
* or GFP_ATOMIC.
*/
p = kmalloc_track_caller(len, GFP_KERNEL);
if (!p)
return ERR_PTR(-ENOMEM);
Sure, GFP_ATOMIC and GFP_NOFS would be pointless. However, GFP_USER and,
possibly, __GFP_NOWARN would not. As the matter of fact, of 200-odd callers
only for two I couldn't prove that they'd be fine with GFP_USER:
arch/x86/kvm/x86.c:2076: page = memdup_user(blob_addr + (page_num * PAGE_SIZE), PAGE_SIZE);
include/rdma/ib_verbs.h:2437: buf = memdup_user(p, len);

The former is weird xen crap used by one place in kvm_set_msr_common() and
I've given up trying to sort the call chains out; it might very well be fine
with GFP_USER, actually. The latter is a flaming atrocity from the
bowels of infiniband. Might or might not be fine with GFP_USER, but in
any case that code is utter crap. What it tries to do is verifying that given
range of userland memory contains only zeroes. Callers are also not pretty and
hard to track, as well.

Everything else is definitely fine with GFP_USER - it's stuff like "copy of ioctl
arguments in an ioctl never issued by the kernel code, must have come straight from
ioctl(2)" and things like that. IMO we should simply switch memdup_user() to
GFP_USER and be done with that. Limiting the size ought to be done by callers and
IMO there's no point in __GFP_NOWARN there.

What I propose is
* switch memdup_user() to GFP_USER
* add vmemdup_user(), using kvmalloc() instead of kmalloc() (also with
GFP_USER)
* switch open-coded instances of the latter to calling it
* switch some of the memdup_user() callers to vmemdup_user() - the ones that
don't need physically contiguous copy and might be larger than a couple of pages.
* add apriori bounds on size in the call sites that do not have those yet -
that'll require comments from maintainers of the code in question in some cases.

Objections?