Re: 2.4.0-test10pre5: still IDE lockups on HPT366 controller.

From: Mike Galbraith (mikeg@wen-online.de)
Date: Tue Oct 24 2000 - 21:50:10 EST


On Tue, 24 Oct 2000, Mark Hahn wrote:

> I don't really expect much from my BP6, but:
> -------Sequential Output-------- ---Sequential Input-- --Random--
> -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
> MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU /sec %CPU
> 500 24150 100.8 57334 63.5 26323 31.7 28861 91.4 60735 52.0 258.6 2.1
>
> that's on a 128M machine, so the bandwidths are real. that's a
> 3-way raid0 of 15G/platter udma33 disks under test10pre5.
>
> the astonishing thing is that when bonnie's writing, the machine
> wakes bdflush up > 50,000 times per second! so obviously, the
> %CPU numbers are even less meaningful than usual.

Hi Mark,

I've made a little progress fighting with bdflush. Can you please
try this and see if it helps you? I have still to figure out why,
but here, the first bdflush param _must_ be over 75 and under 90
to avoid zillions of context switches. That alone will probably
help enough, but I still think bdflush needs to do what the comments
say too.

I've done a LOT of change_this/change_that, but this is what seems
to help the most. It still starts out context switching like a madman,
but smooths out quickly. I think that this means that memory pressure
needs to be smoothed more frequently (tinker tinker tweak twiddle)
If you fiddle with the bdflush numbers a little, you'll see what I
mean with vmstat.

        -Mike

echo 80 64 128 256 500 3000 500 1884 2 > /proc/sys/vm/bdflush

--- mm/vmscan.c.org Wed Oct 25 03:49:45 2000
+++ mm/vmscan.c Wed Oct 25 03:50:30 2000
@@ -649,7 +649,7 @@
                         spin_unlock(&pagemap_lru_lock);
 
                         /* Will we do (asynchronous) IO? */
- if (sync && launder_loop && maxlaunder == 0)
+ if (sync && launder_loop && maxlaunder-- > 0)
                                 wait = 2; /* Synchrounous IO */
                         else if (launder_loop && maxlaunder-- > 0)
                                 wait = 1; /* Async IO */
--- fs/buffer.c.org Tue Oct 24 16:18:50 2000
+++ fs/buffer.c Tue Oct 24 18:02:25 2000
@@ -2640,7 +2640,7 @@
 int bdflush(void *sem)
 {
         struct task_struct *tsk = current;
- int flushed;
+ int flushed = 0;
         /*
          * We have a bare-bones task_struct, and really should fill
          * in a few more things so "top" and /proc/2/{exe,root,cwd}
@@ -2664,9 +2664,22 @@
         for (;;) {
                 CHECK_EMERGENCY_SYNC
 
- flushed = flush_dirty_buffers(0);
+ /*
+ * flush_dirty_buffers() will flush no more than ndirty buffers
+ * at one time.
+ */
+ flushed += flush_dirty_buffers(0);
                 if (free_shortage())
                         flushed += page_launder(GFP_BUFFER, 0);
+ /*
+ * If there are still a lot of dirty buffers around,
+ * skip the sleep and flush some more. Otherwise, we
+ * go to sleep waiting a wakeup.
+ */
+ if (flushed < bdf_prm.b_un.nrefill && balance_dirty_state(NODEV) >= 0) {
+ run_task_queue(&tq_disk);
+ continue;
+ }
 
                 /* If wakeup_bdflush will wakeup us
                    after our bdflush_done wakeup, then
@@ -2678,15 +2691,8 @@
                    deadlock in SMP. */
                 __set_current_state(TASK_INTERRUPTIBLE);
                 wake_up_all(&bdflush_done);
- /*
- * If there are still a lot of dirty buffers around,
- * skip the sleep and flush some more. Otherwise, we
- * go to sleep waiting a wakeup.
- */
- if (!flushed || balance_dirty_state(NODEV) < 0) {
- run_task_queue(&tq_disk);
- schedule();
- }
+ flushed = 0;
+ schedule();
                 /* Remember to mark us as running otherwise
                    the next schedule will block. */
                 __set_current_state(TASK_RUNNING);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Oct 31 2000 - 21:00:15 EST