Re: mapping user space buffer to kernel address space

From: Andrea Arcangeli (andrea@suse.de)
Date: Mon Oct 16 2000 - 18:18:33 EST


On Mon, Oct 16, 2000 at 03:21:11PM -0700, Linus Torvalds wrote:
> Pinning will not happen.

Pinning happens every day on my box while I use rawio.

If you want to avoid pinning _userspace_ pages then we should delete
map_user_kiobuf and define a new functionality and API to replace RAWIO for
DBMS that must bypass the kernel cache while doing I/O to disk, possibly
providing zero copy below highmem as rawio does. (Later on we should
do the same for networking.)

IMHO rawio is providing the above necessary functionality with a
sane userspace interface.

> (And remap_page_range() has nothing to do with pinning - they are just
> pages that cannot be swapped out because they are not normal pages at
> all).

They are _normal_ pages allocated by a device driver and made temporarily
visible to userspace by mapping them into the virtual address space of the
process _after_ "pinning" them with the PG_reserved bitflag. If we didn't pin
them, they would be unmapped like any other page as soon as they became visible
in the virtual address space of the process.

I don't think the two cases are much different. The main difference I can see
is the one that was buggy: remap_page_range doesn't have to make sure that the
page stays there while pinning it, because before it is pinned the page is
still private and not visible to the VM (that's why it's much simpler).
map_user_kiobuf instead is more complex because it must make sure that the
page stays there while we pin it (and that should be fixed now).

I hope we're not getting confused by the term "pin". By "pin" I always meant
preventing the userspace-visible page from going away from under us, because
of underlying VM activity, while we use it from kernel space. I don't see any
other possible meaning for "pin" in the context of map_user_kiobuf.
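
To make that meaning of "pin" concrete, here is a toy userspace sketch in C.
It is not kernel code and not how map_user_kiobuf is implemented: struct
toy_page, toy_pin, toy_unpin and toy_reclaim are hypothetical names invented
for illustration, and the "reclaim" pass only stands in for whatever VM
activity could take a page away. The one point it shows is that a page with
a pin held is skipped by reclaim until the pin is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a page; this is NOT the kernel's struct page. */
struct toy_page {
    bool mapped;       /* still present in the process's address space */
    unsigned pins;     /* number of kernel-side users holding the page */
};

/* Take a pin: the page must stay in place until toy_unpin() is called. */
static void toy_pin(struct toy_page *p)
{
    p->pins++;
}

/* Drop a pin previously taken with toy_pin(). */
static void toy_unpin(struct toy_page *p)
{
    assert(p->pins > 0);   /* unbalanced unpin is a caller bug */
    p->pins--;
}

/*
 * Stand-in for VM reclaim: evict every mapped page that nobody has pinned.
 * Returns the number of pages evicted by this pass.
 */
static size_t toy_reclaim(struct toy_page *pages, size_t n)
{
    size_t evicted = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].mapped && pages[i].pins == 0) {
            pages[i].mapped = false;
            evicted++;
        }
    }
    return evicted;
}
```

With three mapped pages and a pin on the middle one, a toy_reclaim() pass
evicts only the other two; after toy_unpin() a second pass takes the
remaining page as well.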

(In journaling we instead use "pin" to mean a page that can't be freed but that
also can't be written before some other transaction is committed to disk.)

Andrea