real POSIX.1b semaphores

Ulrich Drepper (drepper@myware.rz.uni-karlsruhe.de)
19 Nov 1996 04:56:43 +0100


Hi,

I thought a bit about real POSIX.1b semaphores. Since the defined
interface allows global semaphores we clearly need kernel support.

For the sake of speed we should try to handle as much as possible
at user level. Both kinds of POSIX semaphore can be reduced to one
form. I thought about the following, but please don't take this as
a well-thought-out design. I do not have much experience with
kernel programming.

- reserve two pages in the mmap area of each process. The first page
is for global locks, i.e., the very same page is shared by all
processes. The second page is shared between all threads of a
process (i.e., clone needs another flag).

The pages should be write protected.

- each semaphore uses sizeof(unsigned int) bytes, normally 4.

- one way to create a semaphore is to call sem_init(). This will
have to be a syscall. It returns a pointer to an address in one of
the two semaphore spaces, depending on the value of the PSHARED
argument. The initial value is given by the third parameter.

- removing the semaphore need not be explained. There must in any case
be an allocation bitmap or something like this. This function again
needs a syscall.

- creating a semaphore using sem_open() requires another resource,
a pseudo file system where the names are allocated. The standard leaves
it open whether these names are visible in the normal filesystem, but I
think that would be very useful.

Would it be difficult to extend the /proc filesystem to dynamically
create a hierarchy of dirs for the semaphores?

In any case each known name must reference an address in the global
semaphore page. Using sem_open() with an unknown semaphore and O_CREAT
will allocate a new semaphore, just like sem_init(). The return value
of this function is again a pointer and so all the other sem_*()
functions can handle semaphores created by sem_init() and sem_open()
in the same way.

- calling sem_close() will of course decrement the reference count, but
will not remove the name from the pseudo-filesystem when the count
reaches zero. There is another function, sem_unlink(), which removes
the semaphore. This again requires a syscall or two.

- the function sem_getvalue() is implemented trivially by reading the
addressed memory location of the semaphore. No kernel action is
required.

- the functions sem_wait(), sem_trywait(), and sem_post() will write to
the memory for the semaphore and cause a bus error. Now the action
can be performed by the kernel.

I don't know whether it is possible to work with one single memory
location for all actions and decode the faulting instruction to tell
them apart (unlikely). Perhaps using two or three words for each
semaphore is necessary. Then the action could be decided based on
the address written to. It would be up to the library to choose the
correct location.
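
To make the layout above a bit more concrete, here is a rough
user-space sketch of one semaphore space: a page of unsigned int
slots plus an allocation bitmap. All names (struct sem_space,
sem_slot_alloc(), sem_slot_free(), sem_slot_getvalue()) and the 4096
byte page size are assumptions made for illustration. In the proposal
the allocation and removal would happen inside syscalls and the page
would be write protected; none of that is simulated here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names and sizes; only the layout is modelled. */
#define SEM_PAGE_SIZE 4096u /* assumed page size */
#define SEM_SLOTS     (SEM_PAGE_SIZE / sizeof(unsigned int))
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS     ((SEM_SLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct sem_space {
    unsigned int  slot[SEM_SLOTS]; /* one counter per semaphore     */
    unsigned long used[MAP_WORDS]; /* allocation bitmap, 1 = in use */
};

/* Stand-in for the sem_init() syscall: take the first free slot,
 * store the initial value, and return a pointer to the slot.
 * Returns NULL when every slot is in use. */
static unsigned int *sem_slot_alloc(struct sem_space *sp, unsigned int value)
{
    for (size_t i = 0; i < SEM_SLOTS; i++) {
        unsigned long bit = 1UL << (i % BITS_PER_WORD);
        if (!(sp->used[i / BITS_PER_WORD] & bit)) {
            sp->used[i / BITS_PER_WORD] |= bit;
            sp->slot[i] = value;
            return &sp->slot[i];
        }
    }
    return NULL;
}

/* Stand-in for removing a semaphore: clear its bit in the bitmap. */
static void sem_slot_free(struct sem_space *sp, unsigned int *sem)
{
    assert(sem >= sp->slot && sem < sp->slot + SEM_SLOTS);
    size_t i = (size_t)(sem - sp->slot);
    sp->used[i / BITS_PER_WORD] &= ~(1UL << (i % BITS_PER_WORD));
}

/* Reading the value is a plain load, so no kernel action is needed. */
static unsigned int sem_slot_getvalue(const unsigned int *sem)
{
    return *sem;
}
```

With a zero-initialised struct sem_space, allocating a slot with
initial value 3 and reading it back through sem_slot_getvalue() gives
3, and a freed slot is handed out again by the next allocation. The
trapping of writes for sem_wait() and sem_post() is deliberately left
out, since how the kernel would decode them is the open question.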

Would this be reasonable? Does anybody plan to do something like
this? We really should have this soon, now that the only remaining
limits on multi-threading are in the kernel. The libpthread and the
libc (=glibc) are ready.

If anybody is interested I could provide the information from
the standard. Thanks to RedHat I now have the full POSIX.1 standard.

(PS: A second *very* important task would be to implement CLONE_PID.)

Thanks,

-- Uli
--------------. drepper@cygnus.com ,-. Rubensstrasse 5
Ulrich Drepper \ ,--------------------' \ 76149 Karlsruhe/Germany
Cygnus Support `--' drepper@gnu.ai.mit.edu `------------------------