Re: [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call

From: Linus Torvalds
Date: Mon Feb 13 2017 - 16:41:00 EST

On Mon, Feb 13, 2017 at 1:56 AM, Steve Capper <steve.capper@xxxxxxxxxx> wrote:
> Okay so looking at what we have for access_ok(.) on arm64, my
> understanding is that we perform a 65-bit add/compare (in assembler) to
> see whether or not the range is below the current_thread_info->addr_limit.
> So I think this is a roundabout way of checking for no-wrap around and <= TASK_SIZE.

No, that's the problem. It's *not* testing against TASK_SIZE.

Because addr_limit is not always TASK_SIZE. When you do
set_fs(KERNEL_DS), you set addr_limit to infinity.
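
To make the distinction concrete, here is a minimal user-space sketch, not the actual arm64 code. Every name in it (sketch_addr_limit, sketch_range_ok, sketch_below_task_size) is hypothetical, the 48-bit SKETCH_TASK_SIZE is an assumed value, and "infinity" is modelled as UINT64_MAX. It only shows that a no-wrap check against addr_limit stops being a check against TASK_SIZE once the limit has been raised.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only, not taken from any real config. */
#define SKETCH_TASK_SIZE  (UINT64_C(1) << 48)
#define SKETCH_USER_DS    SKETCH_TASK_SIZE
#define SKETCH_KERNEL_DS  UINT64_MAX   /* models "addr_limit = infinity" */

/* Hypothetical stand-in for the per-thread addr_limit. */
static uint64_t sketch_addr_limit = SKETCH_USER_DS;

/*
 * Models the access_ok()-style test: true when addr + size does not
 * wrap and stays at or below the *current* limit. Written without the
 * addition so it cannot overflow in 64 bits.
 */
static bool sketch_range_ok(uint64_t addr, uint64_t size)
{
    return size <= sketch_addr_limit && addr <= sketch_addr_limit - size;
}

/* The test people tend to assume they are getting: a fixed TASK_SIZE bound. */
static bool sketch_below_task_size(uint64_t addr, uint64_t size)
{
    return size <= SKETCH_TASK_SIZE && addr <= SKETCH_TASK_SIZE - size;
}
```

With the limit at SKETCH_USER_DS the two functions agree. After setting sketch_addr_limit to SKETCH_KERNEL_DS, sketch_range_ok() accepts a kernel-looking address such as 0xffff000000001000 while sketch_below_task_size() still rejects it.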

And yes, the kernel does read and write calls too. Seldom, but it
happens. And walking the page tables with kernel addresses is not
supposed to work (sometimes it happens to work by mistake). So if
somebody finds a path that gets from that kind of situation into the
get_user_pages() interface, bad things happen.