Re: [patch 0/9] mutex subsystem, -V4

From: Ingo Molnar
Date: Sun Dec 25 2005 - 18:05:24 EST



* Roman Zippel <zippel@xxxxxxxxxxxxxx> wrote:

> > c) semaphores are total overkill for 99% of the users. Remember
> > this thing about optimizing for the common case?
>
> [...] I also have hardly seen any discussion about why semaphores are
> the way they are. Linus did suspect there is a wakeup bug in the
> semaphore, but there was no conclusive followup to that.

no conclusive follow-up because ... they are too complex for people to
answer such questions off the cuff? Something so frequently used in
trivial ways should have the complexity of that typical use, not the
complexity of the theoretical use. There is no problem with semaphores,
other than that they are not being used as semaphores all that often.

for which i think there is a rather simple practical reason: if i want
to control a counted resource within the kernel, it is rarely the
simplest solution to use a semaphore for it, because a semaphore cannot
be used to protect data structures in the 'resource is available' stage
[i.e. when the semaphore count is above zero]. It does decrement the
counter atomically, but that is just half of the job i have to do.

to control (allocate/free) the resource i _have to_ add some other
locking mechanism anyway in most cases (a spinlock most likely, to
protect the internal list and other internal state) - at which point
it's simpler and faster to simply add a counter and a waitqueue to those
existing internal variables than to add a separate locking object
around (or within) the whole construct.
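
A minimal user-space sketch of the pattern described above, not kernel
code: a pthread mutex stands in for the spinlock and a condition variable
stands in for the waitqueue. The names (struct pool, pool_get(), pool_put(),
POOL_SLOTS) are hypothetical and exist only for illustration. The point is
that the counter sits next to the free list under the one lock that is
needed anyway, so taking a slot and decrementing the count happen as a
single critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SLOTS 4   /* hypothetical pool size */

struct slot {
    struct slot *next;
    int id;
};

struct pool {
    pthread_mutex_t lock;      /* protects free_list and nr_free */
    pthread_cond_t  wait;      /* user-space stand-in for a waitqueue */
    struct slot    *free_list; /* internal state a bare semaphore cannot protect */
    int             nr_free;   /* the "semaphore count", kept under the same lock */
    struct slot     slots[POOL_SLOTS];
};

static void pool_init(struct pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->wait, NULL);
    p->free_list = NULL;
    p->nr_free = 0;
    for (int i = 0; i < POOL_SLOTS; i++) {
        p->slots[i].id = i;
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
        p->nr_free++;
    }
}

/* Block until a slot is free, then unlink it and drop the count together. */
static struct slot *pool_get(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    while (p->nr_free == 0)
        pthread_cond_wait(&p->wait, &p->lock);
    struct slot *s = p->free_list;
    assert(s != NULL);         /* nr_free > 0 implies a non-empty list */
    p->free_list = s->next;
    p->nr_free--;
    pthread_mutex_unlock(&p->lock);
    return s;
}

/* Return a slot to the list, bump the count, and wake one waiter. */
static void pool_put(struct pool *p, struct slot *s)
{
    pthread_mutex_lock(&p->lock);
    s->next = p->free_list;
    p->free_list = s;
    p->nr_free++;
    pthread_cond_signal(&p->wait);
    pthread_mutex_unlock(&p->lock);
}
```

With a separate semaphore wrapped around this, the list would still need
its own lock, so every get/put would pay for two synchronization objects
instead of one.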

semaphores would be nice in theory, if there were a way to attach the
'decrement counter atomically' operation to another set of atomic ops,
like list_del() or list_add(), making the whole thing transactional.
[this would also be a wholly new API, so it only applies to semaphores
as a concept, not our actual semaphore incarnation] So i see the
theoretical beauty of semaphores, but in practice, CPUs force us to work
with much simpler constructs.

there are some exceptions: e.g. when the resource is _nothing else_ but
a count.
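
As a hedged illustration of that exception, here is a user-space sketch
using POSIX unnamed semaphores (assuming a platform where sem_init() is
supported, such as Linux). The limiter_*() helpers and MAX_IN_FLIGHT are
hypothetical names. There is no list or other state to protect: the
resource is only "how many callers may be inside at once", so the
semaphore does the whole job.

```c
#include <semaphore.h>

#define MAX_IN_FLIGHT 2   /* hypothetical limit on concurrent callers */

static sem_t in_flight;

/* Returns 0 on success, like sem_init() itself. */
static int limiter_init(void)
{
    return sem_init(&in_flight, 0, MAX_IN_FLIGHT);
}

/* Non-blocking: returns 1 if a unit of the count was taken, 0 otherwise. */
static int limiter_try_enter(void)
{
    return sem_trywait(&in_flight) == 0;
}

/* Give the unit back. */
static void limiter_leave(void)
{
    sem_post(&in_flight);
}
```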

Ingo