The following patch fixes a busy-loop deadlock in kflushd(). This
deadlock was probably only triggered under rare conditions (if ever),
but as a side effect of my recent buffer.c patch, it has become more common.
The deadlock can occur if kflushd() finds all the buffers (most
likely just one :-) on the dirty list to be both dirty and locked:
    if (no buffers written && nr_buffers_type[BUF_DIRTY] > 0)
        don't sleep and continue looping;
Usually, a buffer is locked and marked clean by ll_rw_blk.c at once,
but if the request queue is full, we sleep between locking the buffer
and refiling it.
If, during that sleep, the buffer was submitted by another task and
kflushd() is awakened and finds no other dirty, unlocked buffers, it
enters an infinite busy loop.
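The scenario above can be sketched as a small user-space model. This is
only an illustration, not kernel code: passes_until_sleep(), WRITEA_CMD,
WRITE_CMD and the pass limit are hypothetical names made up for the
sketch, and it assumes every pass writes nothing because all remaining
dirty buffers are locked.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's WRITEA and WRITE commands. */
enum { WRITEA_CMD, WRITE_CMD };

/*
 * Model of the decision at the end of one kflushd() pass, in the stuck
 * state where ndirty is always 0 (nothing could be written).
 * Returns the pass number on which the daemon would go to sleep, or -1
 * if it is still spinning after `limit` passes.
 */
static int passes_until_sleep(int patched, int nr_dirty, int nr_buffers,
                              int nfract, int limit)
{
    int wrta_cmd = WRITEA_CMD;
    int pass;

    assert(limit > 0);

    for (pass = 1; pass <= limit; pass++) {
        int ndirty = 0; /* every dirty buffer is locked by someone else */

        /* First check: the patch adds the wrta_cmd != WRITE test. */
        if (ndirty == 0 && nr_dirty > 0 &&
            (!patched || wrta_cmd != WRITE_CMD)) {
            wrta_cmd = WRITE_CMD;
            continue; /* loop again without sleeping */
        }

        /* Second check: the patch also sleeps when nothing was written. */
        if ((patched && ndirty == 0) ||
            nr_dirty <= nr_buffers * nfract / 100)
            return pass; /* would sleep on bdflush_wait here */
    }
    return -1; /* never reached the sleep: busy loop */
}
```

With one locked dirty buffer, passes_until_sleep(0, 1, 1000, 60, 1000)
never reaches the sleep in this model, while the patched variant
passes_until_sleep(1, 1, 1000, 60, 1000) retries once with a blocking
write and then sleeps on the second pass.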
Gadi
--- vpre-2.0.31-7/linux/fs/buffer.c Mon Aug 18 05:49:09 1997
+++ linux/fs/buffer.c Mon Aug 18 05:45:56 1997
@@ -1717,7 +1717,7 @@
* dirty buffers, then make the next write to a
* loop device to be a blocking write.
* This lets us block--which we _must_ do! */
- if (ndirty == 0 && nr_buffers_type[BUF_DIRTY] > 0) {
+ if (ndirty == 0 && nr_buffers_type[BUF_DIRTY] > 0 && wrta_cmd != WRITE) {
wrta_cmd = WRITE;
continue;
}
@@ -1725,7 +1725,7 @@
/* If there are still a lot of dirty buffers around, skip the sleep
and flush some more */
- if(nr_buffers_type[BUF_DIRTY] <= nr_buffers * bdf_prm.b_un.nfract/100) {
+ if(ndirty == 0 || nr_buffers_type[BUF_DIRTY] <= nr_buffers * bdf_prm.b_un.nfract/100) {
wake_up(&bdflush_done);
current->signal = 0;
interruptible_sleep_on(&bdflush_wait);