Re: Idea to make a kickass TCP read

Andi Kleen (ak@muc.de)
07 Aug 1998 15:32:03 +0200


Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
>
> On the "SHM segment handle" side of things, I often feel that making
> vma descriptors available at user level like file descriptors would be
> a nice idea. Currently, file descriptors are "first class objects" at
> user level and can be passed between processes via Unix domain
> sockets. However, for some things it would be nice to have "mmap"
> granularity where you map a region of memory (visible kernel-side as a
> vma) and get a descriptor for it. Then you could set protections on it
> (say) and "send" it to another process. That process could map it via
> the descriptor. Note that SysV shm handles aren't quite the same:
> they're equivalent to sending a filename to another process since the
> other process has to "reopen" it and the permissions check happens at
> that point. A first class vma descriptor would be the sort of thing
> you'd want to pass to your pseudo-read call. Whether or not the
> networking side could make use of it with its hardware, the semantics
> would mean that the kernel would be allowed to optimise any transfers
> however it liked because it knows the entire object it's working on.
> For passing vma descriptors to other processes, an analogue of
> SCM_RIGHTS would do the trick.

I implemented this. You do a shared mmap() of a specific range of an fd, and
then pass the fd together with a token describing the range over a Unix socket.
The receiver then mmap()s the range too. Once the fd has been passed you can
mmap() further regions as well - just send a small Unix message describing the
range, without passing a new fd.
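
For illustration, here is a rough sketch of how such an exchange could look.
It is not the library code; struct range_token, send_range(), recv_range(),
map_range() and range_demo() are made-up names, and the token layout
(offset plus length) is only an assumption about what "describing the range"
needs to carry. The demo runs both ends in one process over a socketpair.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical token naming one range of the shared file. */
struct range_token {
    off_t  offset;  /* must be page aligned */
    size_t length;  /* bytes */
};

/* Send a token; if fd >= 0, pass it along as SCM_RIGHTS ancillary data. */
int send_range(int sock, int fd, const struct range_token *tok)
{
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct iovec iov = { .iov_base = (void *)tok, .iov_len = sizeof(*tok) };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

    if (fd >= 0) {
        memset(&u, 0, sizeof(u));
        msg.msg_control = u.buf;
        msg.msg_controllen = sizeof(u.buf);
        struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
        c->cmsg_level = SOL_SOCKET;
        c->cmsg_type = SCM_RIGHTS;
        c->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(c), &fd, sizeof(int));
    }
    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(*tok) ? 0 : -1;
}

/* Receive a token; *fd becomes the passed descriptor, or -1 if none came. */
int recv_range(int sock, int *fd, struct range_token *tok)
{
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct iovec iov = { .iov_base = tok, .iov_len = sizeof(*tok) };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = u.buf,
                          .msg_controllen = sizeof(u.buf) };

    *fd = -1;
    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*tok))
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c))
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_RIGHTS)
            memcpy(fd, CMSG_DATA(c), sizeof(int));
    return 0;
}

/* Shared-map the range named by the token on the given fd. */
void *map_range(int fd, const struct range_token *tok)
{
    assert(tok->offset % sysconf(_SC_PAGESIZE) == 0);
    return mmap(NULL, tok->length, PROT_READ | PROT_WRITE, MAP_SHARED,
                fd, tok->offset);
}

/* Round trip inside one process, standing in for sender and receiver. */
int range_demo(void)
{
    char path[] = "/tmp/rangeXXXXXX";
    long pg = sysconf(_SC_PAGESIZE);
    int sv[2], fd = mkstemp(path), rfd, none;
    struct range_token t1 = { 0, (size_t)pg }, t2 = { pg, (size_t)pg }, r;

    if (fd < 0 || unlink(path) || ftruncate(fd, 2 * pg) ||
        socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
        return -1;
    char *a = map_range(fd, &t2);          /* sender's view of range 2 */
    if (a == MAP_FAILED)
        return -1;
    strcpy(a, "hello");
    /* The first message carries the fd, the second only a token. */
    if (send_range(sv[0], fd, &t1) || recv_range(sv[1], &rfd, &r) || rfd < 0)
        return -1;
    if (send_range(sv[0], -1, &t2) || recv_range(sv[1], &none, &r) || none != -1)
        return -1;
    char *b = map_range(rfd, &r);          /* receiver's view of range 2 */
    return (b != MAP_FAILED && strcmp(b, "hello") == 0) ? 0 : -1;
}
```

In the demo the second range is mapped through the fd received with the first
message, so only the small token has to travel for every further region.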

Unfortunately it only works with named files, not with anonymous swap
space (just opening /dev/zero does not work, you need a specific
file). Using the new SCM_CREDENTIALS ancillary message in 2.1 it is
even possible to do (uid, gid) authentication.
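
A minimal sketch of the credentials check on the receiving side, assuming the
Linux SO_PASSCRED socket option and struct ucred as they exist in current
glibc (the helper names enable_passcred(), recv_with_cred() and
peer_allowed() are invented for this example, and the policy of comparing
against one expected uid/gid pair is just an assumption):

```c
#define _GNU_SOURCE             /* for struct ucred */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ask the kernel to attach sender credentials to received messages. */
int enable_passcred(int sock)
{
    int on = 1;
    return setsockopt(sock, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on));
}

/*
 * Receive up to len bytes and the sender's credentials.
 * Returns the byte count, or -1 on error or if no credentials arrived.
 */
int recv_with_cred(int sock, void *buf, size_t len, struct ucred *cred)
{
    union { char buf[CMSG_SPACE(sizeof(struct ucred))];
            struct cmsghdr align; } u;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = u.buf,
                          .msg_controllen = sizeof(u.buf) };
    ssize_t n;

    assert(cred != NULL);
    n = recvmsg(sock, &msg, 0);
    if (n < 0)
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c))
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(cred, CMSG_DATA(c), sizeof(*cred));
            return (int)n;
        }
    return -1;
}

/* Hypothetical policy: accept only one specific (uid, gid) pair. */
int peer_allowed(const struct ucred *cred, uid_t uid, gid_t gid)
{
    return cred->uid == uid && cred->gid == gid;
}
```

The receiver would call enable_passcred() before the peer sends anything and
refuse to map ranges for a peer that fails peer_allowed().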

I wrote a small library to encapsulate this scheme nicely. It uses
a simple buddy allocator to manage the 'virtual space' in a file. To make
sure that the file is deleted when the process gets killed, it is simply
unlinked immediately after creation; the open fd keeps it alive.
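
To make the two ideas concrete, here is a small sketch, not the library
itself. It uses a common array-based binary-tree buddy scheme, which may well
differ from what the library does; NLEAVES, BLOCK_SIZE, struct buddy and all
function names are assumptions made up for the example. The allocator hands
out block indices, and index * BLOCK_SIZE would be the file offset to mmap().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define NLEAVES    64u      /* hypothetical: blocks in the file (power of 2) */
#define BLOCK_SIZE 4096u    /* hypothetical: bytes per block */

/* longest[i] = largest free run (in blocks) below tree node i. */
struct buddy { unsigned longest[2 * NLEAVES - 1]; };

void buddy_init(struct buddy *b)
{
    unsigned node_size = 2 * NLEAVES;
    for (unsigned i = 0; i < 2 * NLEAVES - 1; i++) {
        if (((i + 1) & i) == 0)         /* first node of a new tree level */
            node_size /= 2;
        b->longest[i] = node_size;
    }
}

/* Returns the first block index of the allocation, or -1 if it won't fit. */
int buddy_alloc(struct buddy *b, unsigned nblocks)
{
    unsigned size = 1, i = 0, node_size;

    if (nblocks > NLEAVES)
        return -1;
    while (size < nblocks)              /* round up to a power of two */
        size *= 2;
    if (b->longest[0] < size)
        return -1;
    for (node_size = NLEAVES; node_size != size; node_size /= 2)
        i = (b->longest[2 * i + 1] >= size) ? 2 * i + 1 : 2 * i + 2;
    b->longest[i] = 0;                  /* mark this node as allocated */
    int index = (int)((i + 1) * node_size - NLEAVES);
    while (i) {                         /* update the ancestors */
        i = (i - 1) / 2;
        unsigned l = b->longest[2 * i + 1], r = b->longest[2 * i + 2];
        b->longest[i] = l > r ? l : r;
    }
    return index;
}

/* Free the allocation that starts at block index 'index'. */
void buddy_free(struct buddy *b, unsigned index)
{
    unsigned node_size = 1, i = index + NLEAVES - 1;

    assert(index < NLEAVES);
    for (; b->longest[i] != 0; i = (i - 1) / 2) {   /* find allocated node */
        node_size *= 2;
        if (i == 0)
            return;                     /* nothing was allocated here */
    }
    b->longest[i] = node_size;
    while (i) {                         /* merge free buddies on the way up */
        i = (i - 1) / 2;
        node_size *= 2;
        unsigned l = b->longest[2 * i + 1], r = b->longest[2 * i + 2];
        b->longest[i] = (l + r == node_size) ? node_size : (l > r ? l : r);
    }
}

/*
 * Create the backing file from a mkstemp() template, size it, and unlink
 * it right away so it disappears once the last fd and mapping are gone.
 */
int open_unlinked(char *tmpl)
{
    int fd = mkstemp(tmpl);

    if (fd < 0)
        return -1;
    unlink(tmpl);
    if (ftruncate(fd, (off_t)NLEAVES * BLOCK_SIZE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the name is gone right after creation, nothing is left behind even if
the process dies from a signal it cannot catch.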

It would be nice if it were possible to do the same with anonymous memory.
Unfortunately shared anonymous mappings didn't make it into 2.1 - with them
it would be easy to create a special device similar to /dev/zero
that supports shared mappings per fd.

-Andi
