SMP: unnecessary scheduling when using pipes [2.3.*]

From: Martin Schenk (schenkm@ping.at)
Date: Fri Feb 25 2000 - 14:47:08 EST


short summary:
The semaphore PIPE_SEM(*inode) is only held for very brief periods;
all real waiting is done with PIPE_WAIT. On an SMP system, replacing
this semaphore with a spinlock avoids unnecessary scheduling.

long story:
While experimenting with the lockmeter-patch (to measure spinlock wait
and hold times), I got the following strange results:

 CON   WAIT   TOTAL  NOWAIT   SPIN
 58%  1.4us   22508    9657  12851  __down_interruptible+0x390
 27%  0.7us   22508   16588   5920  __down_interruptible+0xb8
 
There was extremely high contention on the wait_queue lock of a
semaphore [0x390 = remove_wait_queue, 0xb8 = add_wait_queue_exclusive],
caused by contention on the semaphore itself.

All these calls to __down_interruptible were due to contention
on PIPE_SEM in pipe_read and pipe_write.

As PIPE_SEM is only held for a very short time, the "semaphore overhead"
of:
        - adding the waiting process to a waitqueue
        - calling schedule()
          in __down_interruptible(),
        - waking up the process
          in __up(),
        and finally
        - removing the process from the waitqueue
          in __down_interruptible() again
seems excessive [especially if bad timing leaves you spinning when
accessing the waitqueue].

I made a small patch that adds a spinlock_t variable to pipe_inode_info
and uses this spinlock to protect write and read. In all other functions
that do a down(PIPE_SEM), this spinlock is first taken and released
after calling up(PIPE_SEM).
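
To illustrate the idea (this is not the patch itself), here is a minimal
userspace sketch, assuming a POSIX system with pthread spinlocks. The
names toy_pipe, toy_pipe_init, toy_pipe_write and toy_pipe_read are
hypothetical; the struct only stands in for pipe_inode_info with an added
lock. The point it shows is that the buffer bookkeeping is short enough
to be done under a spinlock without ever sleeping.

```c
#define _POSIX_C_SOURCE 200112L
#include <pthread.h>
#include <stddef.h>

#define TOY_PIPE_SIZE 4096

/* Hypothetical userspace stand-in for pipe_inode_info plus the new lock. */
struct toy_pipe {
    pthread_spinlock_t lock;   /* plays the role of the added spinlock_t */
    char buf[TOY_PIPE_SIZE];   /* ring buffer */
    size_t start;              /* index of the first unread byte */
    size_t len;                /* number of bytes currently buffered */
};

/* Returns 0 on success, like pthread_spin_init(). */
int toy_pipe_init(struct toy_pipe *p)
{
    p->start = 0;
    p->len = 0;
    return pthread_spin_init(&p->lock, PTHREAD_PROCESS_PRIVATE);
}

/* Copy up to n bytes into the buffer; returns the number of bytes stored
 * (less than n if the buffer fills up). Never sleeps: a real pipe would
 * wait for space outside the lock, as PIPE_WAIT does. */
size_t toy_pipe_write(struct toy_pipe *p, const char *src, size_t n)
{
    size_t done = 0;

    pthread_spin_lock(&p->lock);
    while (done < n && p->len < TOY_PIPE_SIZE) {
        p->buf[(p->start + p->len) % TOY_PIPE_SIZE] = src[done++];
        p->len++;
    }
    pthread_spin_unlock(&p->lock);
    return done;
}

/* Copy up to n bytes out of the buffer; returns the number of bytes read
 * (0 if the buffer is empty). */
size_t toy_pipe_read(struct toy_pipe *p, char *dst, size_t n)
{
    size_t done = 0;

    pthread_spin_lock(&p->lock);
    while (done < n && p->len > 0) {
        dst[done++] = p->buf[p->start];
        p->start = (p->start + 1) % TOY_PIPE_SIZE;
        p->len--;
    }
    pthread_spin_unlock(&p->lock);
    return done;
}
```

A contending thread here just spins for the few instructions the other
side holds the lock, instead of going through a waitqueue and the
scheduler.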

Can anyone tell me whether locking PIPE_SEM (which is just inode->i_sem
in disguise) is necessary for synchronizing with other parts of the
kernel that lock i_sem?

If someone is interested in the patch, email me.

To reproduce the problem, just run a process writing to a pipe while
another process is reading from it.
Example: pipedis from the IX_SSBA benchmark.
"pipedis 500000 1" takes ~5.7sec with 2.3.46 on my dual PIII; with the
patch it takes just ~1.9sec [2.2.14 takes ~2.1sec].
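
If pipedis is not at hand, a rough stand-in for this kind of load could
look like the sketch below. It is not pipedis and makes no claim to match
its arguments or its numbers; the function name pipe_stream_seconds and
its parameters are made up for illustration. It forks a reader, has the
parent write `count` messages of `msg_size` bytes into a pipe, and
returns the elapsed wall-clock time in seconds (or -1.0 on error).

```c
#define _XOPEN_SOURCE 600
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: stream count messages of msg_size bytes through a
 * pipe from the parent (writer) to a forked child (reader). */
double pipe_stream_seconds(long count, size_t msg_size)
{
    int fds[2];
    struct timeval t0, t1;
    char *buf;
    pid_t pid;
    int status = 0;
    long i;

    if (count <= 0 || msg_size == 0 || msg_size > 4096)
        return -1.0;
    buf = calloc(1, msg_size);
    if (buf == NULL)
        return -1.0;
    if (pipe(fds) < 0) {
        free(buf);
        return -1.0;
    }

    gettimeofday(&t0, NULL);
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        free(buf);
        return -1.0;
    }
    if (pid == 0) {
        /* Child: read until the writer closes its end of the pipe. */
        close(fds[1]);
        while (read(fds[0], buf, msg_size) > 0)
            ;
        _exit(0);
    }

    /* Parent: write the messages, then close to signal end of data. */
    close(fds[0]);
    for (i = 0; i < count; i++) {
        if (write(fds[1], buf, msg_size) != (ssize_t)msg_size)
            break;
    }
    close(fds[1]);
    waitpid(pid, &status, 0);
    gettimeofday(&t1, NULL);
    free(buf);

    if (i != count || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_usec - t0.tv_usec) / 1e6;
}
```

Calling pipe_stream_seconds(500000, 1) from a small main() and printing
the result on kernels with and without the patch should show whether the
same kind of difference appears.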



