Re: [RFC] iov_iter_get_pages() semantics

From: Linus Torvalds
Date: Wed Apr 01 2015 - 12:45:35 EST


On Tue, Mar 31, 2015 at 7:33 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> So the whole "get_page()" thing is broken. Iterating over pages in a
>> KVEC is simply wrong, wrong, wrong. It needs to fail.
>>
>> Iterating over a KVEC to *copy* data is ok. But no page lookup stuff
>> or page reference things.
>
> Hmm... FWIW, for ITER_KVEC the underlying data would bloody better not
> go away anyway - vmalloc space or not. Protecting the object from being
> freed under us is caller's responsibility and caller can guarantee that.
> Would a variant that does kmap_to_page()/vmalloc_to_page() _without_
> get_page() for ITER_KVEC work sanely?

No.

It's insane to iterate over 'struct page' pointers, whether you do
get_page or not.

Why?

The *only* thing you can do with those pages is to copy the content.
You must never *ever* do anything else. And if the caller only copies
the content, then the caller is *wrong* to use the page-iterators.

It really is that simple. Either the caller cares about just the
content - in which case it should use the normal "iterate over
addresses" - or the caller somehow cares about 'struct page' itself,
in which case it is *wrong* to do it over vmalloc space or random
kernel mappings.

You cannot have it both ways.
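
A minimal userspace sketch of that split, for illustration only. None of
the names below are kernel API: "struct model_iter", "model_copy_from_iter()"
and "model_iter_get_pages()" are hypothetical stand-ins, and returning
-EFAULT for the kernel-address case is an assumption about how such a
failure might be reported. Content is copied by walking addresses, and the
page lookup refuses to work on a KVEC-style iterator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; NOT the kernel's struct iov_iter. */
enum model_iter_type { MODEL_ITER_USER, MODEL_ITER_KVEC };

/* One contiguous chunk of data, addressed by pointer and length. */
struct model_seg {
	const void *base;
	size_t len;
};

struct model_iter {
	enum model_iter_type type;
	const struct model_seg *segs;	/* current segment */
	size_t nr_segs;			/* segments left, including current */
	size_t seg_off;			/* bytes already consumed in current */
};

/*
 * Copy up to 'len' bytes out of the iterator by walking addresses.
 * This is valid for every iterator type: only the content is touched.
 * Returns the number of bytes actually copied.
 */
static size_t model_copy_from_iter(void *dst, size_t len, struct model_iter *it)
{
	size_t copied = 0;

	assert(dst != NULL && it != NULL);

	while (copied < len && it->nr_segs > 0) {
		size_t avail = it->segs->len - it->seg_off;
		size_t n = (len - copied < avail) ? len - copied : avail;

		if (n)
			memcpy((char *)dst + copied,
			       (const char *)it->segs->base + it->seg_off, n);
		copied += n;
		it->seg_off += n;

		/* Segment exhausted: advance to the next one. */
		if (it->seg_off == it->segs->len) {
			it->segs++;
			it->nr_segs--;
			it->seg_off = 0;
		}
	}
	return copied;
}

/*
 * Page lookup only makes sense when the pages come from a valid source.
 * For kernel-address iterators the model simply fails (assumed -EFAULT).
 * Actual page pinning for the user case is elided; 0 means "allowed".
 */
static long model_iter_get_pages(const struct model_iter *it)
{
	if (it->type == MODEL_ITER_KVEC)
		return -EFAULT;
	return 0;
}
```

A caller that only wants the bytes would use model_copy_from_iter() and
never see a page; a caller that asks for pages on a KVEC-style iterator
gets an error instead of a 'struct page' for arbitrary kernel memory.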

In fact, even _if_ the caller just does a "kmap()" and then looks at
the contents, it is very questionable to allow it for kernel or
vmalloc data. Why? Because we actually do things like mark vmalloc
areas read-only for module code etc, and using kmap() on the pages is
just a way for bad users to more easily overcome things like that by
mistake.

So it's simply wrong in so many ways. There is absolutely *zero*
excuse to look at "struct page" for random kernel data. You had better
get the struct page some valid way - either by explicitly allocating
the page as such, or by looking it up from a *valid* source (i.e.
looking up a page range from a user mapping).

Linus