Re: > Re: Linux threads -- as seen in NT Magazine

Paul Barton-Davis (pbd@Op.Net)
Tue, 15 Dec 1998 10:32:57 -0500


>> P B-D writes:
>> In the general case, the kernel cannot determine which N threads of a
>> set of M threads (where M > N) are the best to run at a given point in time.
>>
>> "best" here doesn't refer to goodness(), but to user-level application
>> performance. The pipeline David referred to may well be implemented
>> entirely at user-level, using UL spin-locks that are invisible to the
>> kernel. There is therefore no way for the kernel to ensure correct
>> scheduling. All it knows is that a thread called sched_yield(), but
>> has no idea why, or under what circumstances it might want to run
>> again.
>
>The spinlocks in LinuxThreads are a hybrid user/kernel entity. They
>(IIRC) first spin a few times, reschedule a few times, then they go to
>sleep. In the example cited above, why won't you get the desired
>behaviour with this mechanism?
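For reference, the "spin a few times, reschedule a few times, then sleep" behaviour being described could be sketched like this in C. The constants and names here are illustrative only, not the actual LinuxThreads values or code:

```c
/* Sketch of a hybrid "spin, then yield, then sleep" lock, in the
 * spirit of the LinuxThreads behaviour described above.  SPIN_TRIES
 * and YIELD_TRIES are made-up values, not the real ones. */
#include <sched.h>
#include <unistd.h>

typedef struct { volatile int locked; } hybrid_lock_t;

#define SPIN_TRIES  100   /* phase 1: busy-wait this many times      */
#define YIELD_TRIES 10    /* phase 2: then give up the CPU a few times */

static void hybrid_lock(hybrid_lock_t *l)
{
    int i;

    for (;;) {
        /* phase 1: pure spin on an atomic test-and-set */
        for (i = 0; i < SPIN_TRIES; i++)
            if (__sync_lock_test_and_set(&l->locked, 1) == 0)
                return;

        /* phase 2: reschedule between attempts */
        for (i = 0; i < YIELD_TRIES; i++) {
            sched_yield();
            if (__sync_lock_test_and_set(&l->locked, 1) == 0)
                return;
        }

        /* phase 3: back off; a real implementation would block on
         * a kernel event here rather than just napping */
        usleep(1000);
    }
}

static void hybrid_unlock(hybrid_lock_t *l)
{
    __sync_lock_release(&l->locked);
}
```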

I'm not sure I can claim to completely understand the following code,
but I believe that on SMP the behaviour is not as you describe above.

<from include/asm/spinlock.h (i386)>

#define spin_lock_string \
"\n1:\t" \
"lock ; btsl $0,%0\n\t" \
"jc 2f\n" \
".section .text.lock,\"ax\"\n" \
"2:\t" \
"testb $1,%0\n\t" \
"jne 2b\n\t" \
"jmp 1b\n" \
".previous"

#define spin_unlock_string \
"lock ; btrl $0,%0"

#define spin_lock(lock) \
__asm__ __volatile__( \
spin_lock_string \
:"=m" (__dummy_lock(lock)))

So, on an SMP machine, there is no sleep or yield at all. I think.
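Unrolled into C, I read that asm as roughly the following. This is my own sketch of what the instructions do, not the kernel's actual C code:

```c
/* Rough C equivalent of the i386 spin_lock_string asm above:
 * atomically test-and-set bit 0 of the lock word, and if it was
 * already set, busy-read until it clears, then retry.  Note there
 * is no yield or sleep anywhere in the loop. */
static void spin_lock_c(volatile unsigned long *lock)
{
    for (;;) {
        /* "lock ; btsl $0,%0" -- atomic test-and-set of bit 0 */
        if ((__sync_fetch_and_or(lock, 1UL) & 1UL) == 0)
            return;              /* bit was clear: we got the lock */

        /* "2: testb $1,%0 ; jne 2b" -- spin on a plain read */
        while (*lock & 1UL)
            ;                    /* pure busy wait */
    }
}

static void spin_unlock_c(volatile unsigned long *lock)
{
    /* "lock ; btrl $0,%0" -- atomic clear of bit 0 */
    __sync_fetch_and_and(lock, ~1UL);
}
```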

If the thread actually sleep(2)'s on some event, fine. But most
user-level threads use user-level events (e.g. pthread_cond_wait()),
which ultimately resolve to a sched_yield() call, not a call to
sleep().

User-level threads do not, as a rule, use kernel scheduling and
synchronization primitives. If they do, there is very little advantage
to user-level threads - you're better off just using clone() and IPC
semaphores.
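To make that concrete, here is a minimal sketch (my own names, not
LinuxThreads code) of a two-stage pipeline handoff built entirely on a
user-level flag, where the only thing the kernel ever sees is
sched_yield():

```c
/* A user-level "event": stage 2 spins and yields until stage 1
 * raises the flag.  The kernel never sees a lock or a sleep --
 * only a thread calling sched_yield(), with no idea why. */
#include <sched.h>

static volatile int stage1_done = 0;
static volatile int result = 0;

static void stage1(void)
{
    result = 42;              /* produce some work */
    __sync_synchronize();     /* publish the result before the flag */
    stage1_done = 1;
}

static void stage2_wait(void)
{
    while (!stage1_done)
        sched_yield();        /* all the kernel ever observes */
}
```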

>Process 1 unlocks process 3, then locks waiting for 3 to finish. When
>process 1 yields, process 3 will be run (since process 2 is on the
>other CPU).

As explained above, in an SMP system with genuine user-level threads,
the "locking" and "unlocking" you're referring to isn't visible to the
kernel.

>> current implementation is even close to "correct" for multithreaded
>> applications is a joke. Fortunately, it just so happens to work for
>> most of the time :)
>
>Certainly has for me :-)

Are you running on an SMP machine? Do you have a threaded,
order-dependent pipeline that uses user-level threads and no
kernel-level synchronization?

--p

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/