Re: [PATCH] Lightweight userspace semaphores...

From: Hubertus Franke (frankeh@watson.ibm.com)
Date: Sat Mar 02 2002 - 09:08:56 EST

Next message: kuznet@ms2.inr.ac.ru: "Re: OOPS: Multipath routing 2.4.17"
Previous message: rob@mur.org.uk: "strange hang with promise ide and 2.4.18"
Next in thread: Paul Jackson: "Re: [Lse-tech] Re: [PATCH] Lightweight userspace semaphores..."
Reply: Paul Jackson: "Re: [Lse-tech] Re: [PATCH] Lightweight userspace semaphores..."
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, Mar 01, 2002 at 09:07:18PM +0100, Martin Wirth wrote:
>
>
> Hubertus Franke wrote:
>
> >
> >So, it works and is correct. Hope that clarifies it, if not let me know.
> >Interestingly enough, this scheme doesn't work for spinning locks.
> >Go to lib/ulocks.c[91-133]; I have integrated this dead code here
> >to show how not to do it. A similar analysis as above shows
> >that this approach wouldn't work. You need cmpxchg for these scenarios
> >(more or less).
> >
>
> You are right, I falsely assumed the initial state to be [1,1].
>
> But as mentioned in your README, your approach is currently not able
> to manage signal handling correctly.
> You have to ignore all non-fatal signals by using ESYSRESTART, and a
> SIGKILL sent to one of the processes
> may corrupt your user-kernel synchronisation.
>
> I don't think a user space semaphore implementation is acceptable until
> it provides (signal-) interruptibility and
> timeouts. But maybe you have some idea how to manage this.
>
> Martin Wirth
>
>

As for the signal handling and ESYSRESTART:
The user code on the slow path can check the return code and
has two choices: (a) re-enter the kernel and wait, or (b) correct the
status word, because the process is still counted as a waiter, and return
0. I have chosen (a) and automated it.
I have tried sending signals (e.g. SIGIO) and it
seems to work fine. The signal is handled in user space and control returns
to the kernel section. (b) is feasible but requires a bit more exception
handling work in the routine.
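
A minimal sketch of choice (a) might look like the following. It is not
the code from the patch: the names slow_path_wait, kernel_wait_fn and
fake_wait are hypothetical, and it assumes an interrupted wait is reported
to user space as -1 with errno set to EINTR, which may differ from the
return code the patch actually uses.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel wait call (not the patch's API).
 * Assumed contract: returns 0 when woken normally, or -1 with errno set
 * to EINTR when a signal interrupted the wait. */
typedef int (*kernel_wait_fn)(void *ctx);

/* Choice (a): keep re-entering the kernel until the wait completes.
 * The signal handler has already run in user space by the time the
 * call returns, so the loop simply goes back to waiting. */
static int slow_path_wait(kernel_wait_fn wait_fn, void *ctx)
{
    int rc;

    assert(wait_fn != 0);
    do {
        rc = wait_fn(ctx);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Stub used only to exercise the loop: reports an interruption
 * *ctx times, then reports a normal wakeup. */
static int fake_wait(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return 0;
}
```

Calling slow_path_wait(fake_wait, &n) with n set to 3 re-enters the stub
three times before returning 0. Choice (b) would replace the loop with a
fix-up of the status word, which is where the extra exception work comes in.
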
As for the corruption case: there is flat out (almost) nothing you can do.
For instance, if a process grabs the lock and exits, and then another process
tries to acquire it, you have a deadlock. At least on the mailing list,
most people feel that you are dead in the water anyway.

I have been thinking about the problem of corruption.
The owner of the lock could register its pid in the lock area (2nd word).
That still leaves non-atomic windows open and is a half-assed solution.
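
To make the non-atomic windows concrete, here is a rough sketch of the
pid-in-second-word idea. Everything in it is an assumption for illustration:
the struct ulock_sketch layout, the encoding (1 = free, 0 = held), and the
helper names are hypothetical and not taken from the patch. It uses C11
atomics and the POSIX kill(pid, 0) existence check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical two-word lock area. */
struct ulock_sketch {
    atomic_int status; /* word 1: 1 = free, 0 = held (assumed encoding) */
    atomic_int owner;  /* word 2: pid of the holder, 0 = not recorded */
};

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int try_acquire(struct ulock_sketch *l)
{
    int expected = 1;

    assert(l != 0);
    if (!atomic_compare_exchange_strong(&l->status, &expected, 0))
        return 0;
    /* Window: a process killed right here holds the lock
     * while word 2 still says "no owner". */
    atomic_store(&l->owner, (int)getpid());
    return 1;
}

static void release(struct ulock_sketch *l)
{
    atomic_store(&l->owner, 0);
    /* Window: the lock is still held but the owner is already cleared. */
    atomic_store(&l->status, 1);
}

/* Returns 1 only if a recorded owner no longer exists. A result of 0
 * proves nothing: the holder may have died inside one of the windows,
 * or its pid may have been reused by an unrelated process. */
static int owner_is_dead(struct ulock_sketch *l)
{
    int pid = atomic_load(&l->owner);

    if (pid == 0)
        return 0;
    return kill((pid_t)pid, 0) == -1 && errno == ESRCH;
}
```

The two marked windows are exactly where a waiter polling owner_is_dead()
would be unable to tell a dead holder from a live one.
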

But more conceptually, if your process dies while holding a lock that protects
some shared data, there is no guarantee you can make about the data
itself. In that kind of scenario, why
try to continue as if nothing happened?
The general consensus on the list seems to be that your app is toast at that
point.

-- Hubertus



This archive was generated by hypermail 2b29 : Thu Mar 07 2002 - 21:00:23 EST