Re: [PATCH v3 0/2] iov_iter: allow iov_iter_get_pages_alloc to allocate more pages per call

From: Steve Capper
Date: Mon Feb 13 2017 - 04:56:30 EST

On Fri, Feb 03, 2017 at 11:28:48AM -0800, Linus Torvalds wrote:
> On Fri, Feb 3, 2017 at 11:08 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On x86 it does. I don't see anything equivalent in mm/gup.c one, and the
> > only kinda-sorta similar thing (access_ok() in __get_user_pages_fast()
> > there) is vulnerable to e.g. access via kernel_write().
>
> Yeah, access_ok() is bogus. It needs to just check against TASK_SIZE
> or whatever.
>
> > doesn't look promising - access_ok() is never sufficient. Something like
> > _PAGE_USER tests in x86 one solves that problem, but if anything similar
> > works for HAVE_GENERIC_RCU_GUP I don't see it. Thus the question re
> > what am I missing here...
>
> Ok, I definitely agree that it looks like __get_user_pages_fast() just
> needs to get rid of the access_ok() and replace it with a proper check
> for the user address space range.
>
> Looks like arm[64] and powerpc are the current users. Adding in some
> people involved with the original submission a few years ago.


[ Apologies for my late reply, I was on vacation and then catching up... ]

> I do note that the x86 __get_user_pages_fast() thing looks dodgy too.
> In particular, we do it right in the *real* get_user_pages_fast(), see
> commit 7f8189068726 ("x86: don't use 'access_ok()' as a range check in
> get_user_pages_fast()"). But then the same bug was re-introduced when
> the "irq safe" version was merged. As well as in the GENERIC_RCU_GUP
> version.
>
> Gaah. Apparently PeterZ copied the old buggy version before the fix
> when he added __get_user_pages_fast() in commit 465a454f254e ("x86,
> mm: Add __get_user_pages_fast()").
>
> I guess it could be considered a merge error (both happened during the
> 2.6.31 merge window).

Okay, so looking at what we have for access_ok() on arm64, my
understanding is that we perform a 65-bit add/compare (in assembler) to
see whether or not the range ends at or below
current_thread_info()->addr_limit. So I think this is a roundabout way
of checking that the range does not wrap around and that it ends at or
below TASK_SIZE.
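
For illustration, the check described above could be sketched in plain C
roughly as follows. This is a standalone userspace sketch, not the actual
arm64 code: range_below_limit() and EXAMPLE_TASK_SIZE are hypothetical
names, and the limit value is made up (the real TASK_SIZE depends on the
architecture and configuration).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit, for illustration only; not the kernel's TASK_SIZE. */
#define EXAMPLE_TASK_SIZE (UINT64_C(1) << 48)

/*
 * Return true if the range [start, start + len) does not wrap around the
 * 64-bit address space and ends at or below limit.
 */
static bool range_below_limit(uint64_t start, uint64_t len, uint64_t limit)
{
	uint64_t end = start + len;	/* unsigned overflow wraps, by definition */

	if (end < start)		/* the addition carried out of bit 63 */
		return false;

	return end <= limit;
}
```

The "end < start" test stands in for the carry out of the 65-bit add.
Passing a fixed limit such as TASK_SIZE rather than addr_limit is what
would make the result independent of any temporarily raised address
limit, which is the concern raised about kernel_write() earlier in the
thread.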

Looking at powerpc, I see it's a little different...

So if it sounds reasonable to folk I was going to send a patch to
replace the call to access_ok(.) with a wraparound + TASK_SIZE check
written explicitly in C? (and remove some of the comments talking about