Re: To be smug, or not to be smug, that is the question

Henrik Olsen (henrik@iaeste.dk)
Tue, 19 Jan 1999 21:42:00 +0000 ( )


On Tue, 19 Jan 1999, Alain Williams wrote:
> On Mon, Jan 18, 1999 at 10:48:11PM -0800, Dan Kegel wrote:
> > "Albert D. Cahalan" <acahalan@cs.uml.edu> writes:
> > > Blocking system calls were a bad idea. Signals were added to unix
> > > to address the lack of a general event queue. Since longjump won't
> > > get you out of one of those crummy blocking system calls, some
> > > fool made signals interrupt system calls. As a patch on top of
> > > a patch on top of a patch, app programmers need to wrap system
> > > calls in loops. Patching the brokenness even more, we see Netscape
> > > talking to itself to get around a stupid race condition. Since
> > > the unixy API does not support dispatching concurrent system calls,
> > > someone added the aio_* functions to "fix" it for the limited case
> > > of simple disk IO. All along the way people find hacks for their
> > > own immediate problem rather than fixing the API.
> >
> > I think he hit a nail on the head somewhere there...
> > a single uniform mechanism for waiting for stuff was
> > what I liked so much about select(), but now I see it
> > isn't up to the task. Sun had a modified select() back in
> > SunWindows days to allow you to wait for UI events, too;
> > I wonder if there's an elegant generalization to select()
> > that could be useful in the future?
>
> AIX select() allows you to wait on message queues as well as FDs.
>
> One thing that I have always thought would be useful is some kind of
> `serialise_access()', this would allow one process after another to lock
> some shared resource -- in the order that they joined the queue. Part
> of the purpose is to remove the wake_everything_up approach that leads
> to the ``thundering herds'', but also means that it is possible that
> something waits a *very* long time before it gets it's turn on a popular
> resouce.
>
> --
> Alain Williams

This thundering herd "problem" can fairly easily be programmed around
without messing around with the select() semantics by letting everything
but one process wait on a semaphor instead of on the select, this ensures
only two proceses wake up, the one waiting on the select, and (shortly
after) the first one waiting on the semaphore once the first one releases
it.
This is part of the way Apache gets around thundering herds.

Of cause this can be seen as yet another hack to get around a problem in
the API, but changing a standard API is very definitely not the way to
solve problems, that only deadends in something like the way Windows
breaks applications left and right every time they "upgrade" the system.

Henrik

-- 
Henrik Olsen,  Dawn Solutions I/S
URL=http://www.iaeste.dk/~henrik/
Get the rest there.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/