Re: [patch] releasing kernel lock during copy_from/to_user

Manfred Spraul (manfreds@colorfullife.com)
Mon, 24 May 1999 02:11:25 +0200


Andrea wrote:
> If another CPU got the lock while we was in copy_from/to_user or in clear
> page it means that we are been right releasing the lock.
lock ping-pong doesn't increase the scalability:
Transfering the lock will take around 50-100 ticks:
the complete cache line must be written & read by the other CPU
(I'm not sure, some CPU's implement a CPU->CPU cache line transfer)

If we release the lock for less than 100 ticks
(e.g copy 50 bytes which are in the L2-cache), then we loose CPU time:
without release:
CPU 1:
...
copy from user (100 ticks)
...

CPU 2 : spins the whole time
--> one CPU is waiting
with release:
CPU 1:
unlock_kernel, transfers the cache line. (100 ticks)
<< CPU 2 can start
copy from user (100 ticks)
<< the CPU spins.

CPY 2:
spins,
reads the cache line, restarts.

As you can see, we loose 100 ticks to transfer the cache line,
then both CPU operate in parallel for 100 ticks,
the CPU 1 spins.
--> Sum: -100+100=0.
Later the lock will be transfered back to CPU 1.
--> overall we lose.

We only scale better if we release the lock far longer than
the time required to transfer the lock from 1 CPU to the other
CPU. I'd say we should release the lock only if we assume that
the other CPU has a good chance to release the lock before
we need it again (~1000 ticks, or 300 bytes uncached memmove,
or 500 bytes uncached memset)

> If the code that will run is very fast then we'll have less probabilty to
> scale and we risk to release the kernel lock without really improve the
> other CPU.
That's the second reason why we should release the lock only if we
assume that more than a certain minimum time is required for
the operation.

Tomorrow, I'll download your latest patch, and I'll modify my
RDTSC code so that I can measure the individual functions from
uaccess.h.

My current guess:
- the string routines are to fast--> do not unlock
- clear_page(): 4000-6000 ticks --> unlock
- copy_to/from_user(): release if more than (n) bytes
- clear_user(): ??
- cksum...(): probably, perhaps if more than (x) bytes.

--
	Manfred
Note: I've read that patches are on the way to Linus for 2.3.3 which
make the complete page-cache parallel on SMP [from Ingo Molnar].

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/