Re: [RFC] iov_iter_get_pages() semantics

From: Linus Torvalds
Date: Wed Apr 01 2015 - 14:34:21 EST


On Wed, Apr 1, 2015 at 11:26 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> IOW, it's fine to do IO on 'struct page', but it should be
> *controlled* and you damn well need to _own_ that struct page and its
> lifetime, not just "look up random struct page from some kernel
> address".

.. and finally, the reason I feel so strongly about this is that we
*already* had a bug exactly here where you tried to make kvec's "more
generic" and it was just wrong, wrong, wrong. It's just dangerous to
randomly allow these things.

When we map things into user space, such mappings have some rules
about them (ie they have to be full pages, they have to actually be
special or refcounted etc etc). Kernel mappings simply don't have
those rules. Not even vmalloc, exactly because we sometimes play games
with kernel mappings. But vmalloc space is at least better than random
"just kernel addresses" which is even worse, since they could be stack
pages or SLAB pages or whatever.

And slab allocations, for example, do *not* honor the page count, even
though such pages do have page counts. The slab allocations can be
reused for other things freely, completely regardless of whether
somebody incremented the page count or not.
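
To make that concrete, here is a toy user-space model of the problem. It
is only a sketch and not kernel code: 'struct toy_page', 'struct
toy_slab', toy_alloc(), toy_free() and slab_reuse_demo() are all made-up
names for illustration, and the real slab allocators are far more
involved. The one property it is meant to show is that the allocator
tracks its objects on its own and never looks at the page count, so
pinning the page does nothing to protect the object's contents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE      64
#define OBJS_PER_PAGE 4

/* Hypothetical stand-in for 'struct page': a refcount plus backing memory. */
struct toy_page {
    int refcount;
    char mem[OBJ_SIZE * OBJS_PER_PAGE];
};

/* Hypothetical slab cache: it tracks object usage by itself. */
struct toy_slab {
    struct toy_page *page;
    int in_use[OBJS_PER_PAGE];
};

/* Hand out the first free object in the page, or NULL if the page is full. */
static void *toy_alloc(struct toy_slab *s)
{
    for (int i = 0; i < OBJS_PER_PAGE; i++) {
        if (!s->in_use[i]) {
            s->in_use[i] = 1;
            return s->page->mem + (size_t)i * OBJ_SIZE;
        }
    }
    return NULL;
}

/* Release an object. Note that page->refcount is never consulted. */
static void toy_free(struct toy_slab *s, void *obj)
{
    ptrdiff_t idx = ((char *)obj - s->page->mem) / OBJ_SIZE;

    assert(idx >= 0 && idx < OBJS_PER_PAGE);
    s->in_use[idx] = 0;
}

/*
 * Returns 1 if the object was reused while the page was still "pinned",
 * which is the failure mode described above.
 */
int slab_reuse_demo(void)
{
    struct toy_page page = { .refcount = 1 };
    struct toy_slab slab = { .page = &page };

    char *buf = toy_alloc(&slab);
    strcpy(buf, "io buffer");

    /* Somebody turns the address into a page and takes a reference. */
    page.refcount++;

    /* The owner frees the object; the slab happily hands it out again. */
    toy_free(&slab, buf);
    char *other = toy_alloc(&slab);
    strcpy(other, "unrelated");

    /* The page is still pinned, but the old buffer now holds foreign data. */
    return page.refcount == 2 &&
           other == buf &&
           strcmp(buf, "unrelated") == 0;
}
```

In this model slab_reuse_demo() returns 1: the extra reference keeps the
toy page "alive", yet whoever pinned it would be doing IO into memory
that now belongs to somebody else.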

And yes, people do things like "kernel_read()" on just normal
kmalloc() allocations. So no, I do *not* think that it's ok to "just
make zero-copy kernel_read() work on kernel addresses by turning them
into 'struct page' and then do who-the-hell-knows-what random things
with such a 'struct page'".

Linus