> On Wednesday 09 April 2008 05:49:15 Max Krasnyansky wrote:
> > Rusty Russell wrote:
> > > This patch modifies tun to allow a vringfd to specify the receive
> > > buffer. Because we can't copy to userspace in bh context, we queue
> > > like normal then use the "pull" hook to actually do the copy.
> > >
> > > More thought needs to be put into the possible races with ring
> > > registration and a simultaneous close, for example (see FIXME).
> > >
> > > We use struct virtio_net_hdr prepended to packets in the ring to allow
> > > userspace to receive GSO packets in future (at the moment, the tun
> > > driver doesn't tell the stack it can handle them, so these cases are
> > > never taken).
> >
> > In general the code looks good. The only thing I could not convince myself
> > of is whether having a generic ring buffer makes sense or not.
> > At least the TUN driver would be more efficient if it had its own simple
> > ring implementation. Less indirection, fewer callbacks, fewer if()s, etc.
> > TUN already has the file descriptor and having two additional fds for rx
> > and tx ring is a waste (think of a VPN server that has to have a bunch of
> > TUN fds). Also as I mentioned before Jamal and I wanted to expose some of
> > the SKB fields through TUN device. With the rx/tx rings the natural way of
> > doing that would be the ring descriptor itself. It can of course be done
> > the same way we copy proto info (PI) and GSO stuff before the packet but
> > that means more copy_to_user() calls and yet more checks.
> >
> > So. What am I missing? Why do we need generic ring for the TUN? I looked
> > at the lguest code a bit and it seems that we need a bunch of network
> > specific code anyway. The cool thing is that you can now mmap the rings
> > into the guest directly but the same thing can be done with TUN specific
> > rings.
>
> I started modifying tun to do this directly, but it ended up with a whole
> heap of code just for the rings, and a lot of current code (eg. read,
> write, poll) ended up inside an 'if (tun->rings) ... else {'. Having a
> natural poll() interface for the rings made more sense, so being their own
> fds fell out naturally.

Hmm, the version that I sent you a while ago (remember I sent you an
attachment with a prototype of the new tun driver and user space code) was
not that bad in that area. I mean it did not touch the existing
read()/write() path. The difference was that it allocated the rings and the
data buffer in the kernel and mapped them into user space. Which is not
what you guys need but that's a separate thing.

> I decided to float this version because it does minimal damage to tun, and I
> know that other people have wanted rings before: I'd like to know if this is
> likely to be generic enough for them.

I see.