From: prasanna@veritas.com (Prasanna Narayana)
Hi,
Would you mind posting this to the linux-kernel mailing list?
For some reason we are unable to post it ourselves.
-- Prasanna
---------------------------- CUT HERE -----------------------------------
Subject: 2.2.14 SMP add_wait_queue problem
Hi,
We are seeing a hard hang on a 2.2.14 SMP kernel while doing heavy
multi-threaded I/O with our software RAID driver.
We have 10 kernel threads, each of which mostly does the following:

    get my_spinlock (spin_lock_irqsave)
    do some work
    add_wait_queue
    release my_spinlock
    schedule()
    get my_spinlock
    remove_wait_queue

and then repeats the cycle.
These are woken up from the i/o done path.
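
For readers who want to experiment with the lock nesting outside the kernel, here is a minimal user-space sketch of the cycle described above. It is only a model under stated assumptions, not the driver code and not the 2.2.14 wait queue implementation: POSIX mutexes stand in for my_spinlock and waitqueue_lock, a semaphore stands in for sleeping in schedule(), the calling thread plays the i/o done path, and every name in it (model_add_wait_queue, model_remove_wait_queue, model_wake_up, run_model) is hypothetical. Interrupt disabling, which is central to the hang, cannot be modelled this way.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct waiter {
    sem_t sem;              /* stands in for sleeping in schedule() */
    int woken;              /* set by the waker, guarded by wq_lock */
    struct waiter *next;
};

static pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER; /* models my_spinlock */
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER; /* models waitqueue_lock */
static struct waiter *wq_head;
static int work_done;          /* guarded by my_lock */
static int workers_left;       /* guarded by my_lock */
static int cycles_per_worker;  /* written before the threads start */

static void model_add_wait_queue(struct waiter *w)
{
    pthread_mutex_lock(&wq_lock);
    w->woken = 0;
    w->next = wq_head;
    wq_head = w;
    pthread_mutex_unlock(&wq_lock);
}

static void model_remove_wait_queue(struct waiter *w)
{
    struct waiter **pp;

    pthread_mutex_lock(&wq_lock);
    for (pp = &wq_head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == w) {
            *pp = w->next;
            break;
        }
    }
    pthread_mutex_unlock(&wq_lock);
}

/* Models the wake-up from the i/o done path: wake each queued waiter once. */
static void model_wake_up(void)
{
    struct waiter *w;

    pthread_mutex_lock(&wq_lock);
    for (w = wq_head; w != NULL; w = w->next) {
        if (!w->woken) {
            w->woken = 1;
            sem_post(&w->sem);
        }
    }
    pthread_mutex_unlock(&wq_lock);
}

static void *worker(void *arg)
{
    struct waiter w;
    int i, rc;

    (void)arg;
    rc = sem_init(&w.sem, 0, 0);
    assert(rc == 0);
    for (i = 0; i < cycles_per_worker; i++) {
        pthread_mutex_lock(&my_lock);       /* get my_spinlock */
        work_done++;                        /* do some work */
        model_add_wait_queue(&w);           /* inner lock taken under outer lock */
        pthread_mutex_unlock(&my_lock);     /* release my_spinlock */
        while (sem_wait(&w.sem) != 0)       /* schedule(); retry if interrupted */
            ;
        pthread_mutex_lock(&my_lock);       /* get my_spinlock */
        model_remove_wait_queue(&w);
        pthread_mutex_unlock(&my_lock);
    }
    pthread_mutex_lock(&my_lock);
    workers_left--;
    pthread_mutex_unlock(&my_lock);
    sem_destroy(&w.sem);
    return NULL;
}

/* Runs nthreads workers for `cycles` cycles each; returns total work units. */
int run_model(int nthreads, int cycles)
{
    pthread_t *tids = malloc(sizeof(*tids) * (size_t)nthreads);
    int i, left, rc;

    assert(tids != NULL);
    work_done = 0;
    wq_head = NULL;
    workers_left = nthreads;
    cycles_per_worker = cycles;
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    do {                                    /* calling thread is the waker */
        model_wake_up();
        sched_yield();
        pthread_mutex_lock(&my_lock);
        left = workers_left;
        pthread_mutex_unlock(&my_lock);
    } while (left > 0);
    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return work_done;
}
```

Calling run_model(10, 100) from a main() models the 10 threads and should return 1000, since each worker does one unit of work per cycle. In this model the waker takes only the inner lock and the workers always take the outer lock first, so the nesting by itself does not deadlock; that fits the observation that the pattern works on UP and suggests the SMP hang depends on something the model leaves out.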
Within a few seconds the machine locks up and nothing works. By adding
printk statements, we have found the following state at the time of the hang:
cpu 1                                    cpu 2
-----                                    -----
* get my_spinlock (spin_lock_irqsave)    * waiting for my_spinlock
* do some work                             with spin_lock_irqsave(),
* trying to do add_wait_queue,             which also means
  but has not completed,                   interrupts are disabled
  probably because of not
  getting waitqueue_lock
Apart from add_wait_queue(), waitqueue_lock is used only
by remove_wait_queue() and __wake_up(). So why doesn't the thread
running on cpu 1 get this lock when cpu 2 is not executing
either remove_wait_queue() or __wake_up()?
This is not a problem on UP, and it works fine on 2.3.xx
(the wait queue implementation is different there).
Also, it does not look like a hardware problem, as we see this
on 3 different machines.
-- Prasanna
This archive was generated by hypermail 2b29 : Fri Apr 07 2000 - 21:00:16 EST