Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

From: Roland Dreier
Date: Mon Apr 25 2005 - 19:03:39 EST


Andrew> Whoa, hang on.

Andrew> The way we expect get_user_pages() to be used is that the
Andrew> kernel will use get_user_pages() once per application I/O
Andrew> request.

Andrew> Are you saying that RDMA clients will semi-permanently own
Andrew> pages which were pinned by get_user_pages()? That those
Andrew> pages will be used for multiple separate I/O operations?

Andrew> If so, then that's a significant design departure and it
Andrew> would be good to hear why it is necessary.

The idea is that applications manage the lifetime of pinned memory
regions. They can do things like post multiple I/O operations without
any page-walking overhead, or pass a buffer descriptor to a remote
host who will send data at some indeterminate time in the future. In
addition, InfiniBand has the notion of atomic operations, so a cluster
application may be using some memory region to implement a global lock.

This might not be the most kernel-friendly design but it is pretty
deeply ingrained in the design of RDMA transports like InfiniBand and
iWARP (RDMA over IP).

I'm also not opposed to implementing some other mechanism to make this
work, but the combiniation of get_user_pages() in the kernel and
extending mprotect() to allow setting VM_DONTCOPY seems to work fine.

- R.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/