Re: [PATCH 1/2] fs: introduce sendfd() syscall

From: Alex Dubov
Date: Tue Dec 02 2014 - 23:14:59 EST

On Wed, Dec 3, 2014 at 2:40 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Dec 03, 2014 at 01:22:33PM +1100, Alex Dubov wrote:
>> On a less related note, I hope you will agree that the simpler
>> mechanism for this very in-demand feature is long overdue on Linux
>> (every man and his dog are passing fds around these days).
> ... and I'm less than sure that it's a good thing. If nothing else,
> once the pieces of your program are passing descriptors around freely,
> you have created a barfball that will be impossible to split between
> several boxen if you run into scalability issues. Descriptor-passing
> is limited to a single system; you *can't* do that between e.g. components
> of a cluster. So it's not an unmixed blessing, just as overuse of
> shared memory segments, etc. They do have their uses, but that needs
> to be carefully considered every time, or you'll create a major headache
> a few years down the road.

Well, if you try hard enough, you can pass fds around the components
of the cluster - Mosix was doing just that some 10 years ago.
Conceptually, it's even easier than doing distributed shared memory,
as long as mmap is not concerned. :-)

I was, however, looking at it from a different standpoint. Abundance
of cores in the contemporary CPUs calls for locally parallel
applications (and those are still the majority - clearly 90% of the
applications and their workloads fit just fine on a single node).

Thus, any modern application developer faces the usual dilemma:

1. Go multi-threaded - easy inter-thread IPC, lousy reliability with
minor errors in secondary tasks crashing the whole application.

2. Go multi-process - circus hoop jumping when IPC is concerned, great
reliability through OS provided fault isolation (so even really broken
stuff, like PHP plugin for apache manages to perform most of the time

Memfd (on its own) and eventfd are great steps in the right direction,
as managing persistent shmem and sem objects was always pain in the
arse. If there was an alternative to AF_UNIX fd passing, with its
arcane API, fs persistence and mind boggling fd recursion bugs, then
option 2 would became much more attractive for developers leading to
over-all increase in application reliability and security.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at