problem with iowait (disk I/O) in 2.4.21 kernel

From: Suman Puthana
Date: Mon Dec 01 2003 - 02:44:39 EST


Hello,

Is anybody aware of any changes in the 2.4.21 kernel that affect disk I/O
badly? We are noticing that the CPU time taken up by iowait is causing
significant degradation in disk performance.

We have an application in which multiple processes write to disk
simultaneously, each process writing anywhere from 16 to 64 KB at a time
and then sleeping for the remainder of a roughly 50 ms interval, depending
on how long the write() call took to finish. It is important that the
write() finishes within 50 ms for this application. The files are opened
with open() using the standard O_RDWR and O_CREAT flags.
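
A minimal sketch of one iteration of the write-then-sleep pattern described
above, mainly to show where the write() latency would be measured. The
function names (elapsed_ms, remaining_sleep_ms, timed_write, write_and_pace)
are hypothetical and not taken from the application; the 50 ms period is
passed in by the caller, and clock_gettime(CLOCK_MONOTONIC) is assumed to be
available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds elapsed between two clock readings. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1000.0
         + (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Time left in the period after a write that took took_ms; never negative. */
double remaining_sleep_ms(double took_ms, double period_ms)
{
    return took_ms < period_ms ? period_ms - took_ms : 0.0;
}

/* Write len bytes to fd; return the write() latency in ms,
 * or -1.0 on error or short write. */
double timed_write(int fd, const void *buf, size_t len)
{
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    ssize_t n = write(fd, buf, len);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;
    if (n < 0 || (size_t)n != len)
        return -1.0;
    return elapsed_ms(&t0, &t1);
}

/* One iteration: write, then sleep out the rest of the period.
 * Returns the write() latency in ms, or -1.0 on failure. */
double write_and_pace(int fd, const void *buf, size_t len, double period_ms)
{
    double took = timed_write(fd, buf, len);
    if (took < 0.0)
        return took;

    double rest = remaining_sleep_ms(took, period_ms);
    struct timespec ts;
    ts.tv_sec = (time_t)(rest / 1000.0);
    ts.tv_nsec = (long)((rest - (double)ts.tv_sec * 1000.0) * 1e6);
    nanosleep(&ts, NULL);
    return took;
}
```

A caller would open the file with open(path, O_RDWR | O_CREAT, 0644), then
call write_and_pace(fd, buf, len, 50.0) in a loop and log any return value
above 50.0 as a missed deadline.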

On the 2.4.18 kernel, 25 of these processes take up about 15% of the CPU on
a dual Xeon Pentium 4 2.4 GHz machine with 1 MB of memory, for a total
throughput of about 7 MBps, which is very good. The write() calls typically
take less than 5 ms to complete and we sleep for the remaining 45 ms or so.
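
As a rough cross-check of the figures above, the aggregate rate follows from
the process count, the write size, and the period. The helper below is a
hypothetical sketch, assuming every process writes the same number of bytes
once per period. With 25 processes writing 16 KB every 50 ms it gives about
8.2 million bytes per second, so the reported 7 MBps would correspond to
writes near the lower end of the 16 to 64 KB range.

```c
/* Aggregate application write rate in bytes per second, assuming each of
 * nprocs processes writes bytes_per_write once every period_ms. */
double aggregate_bytes_per_sec(int nprocs, double bytes_per_write,
                               double period_ms)
{
    if (period_ms <= 0.0)
        return 0.0;
    return (double)nprocs * bytes_per_write * (1000.0 / period_ms);
}
```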

But on the same system with the 2.4.21 kernel installed, these 25
processes take anywhere between 40 and 70% of the CPU, depending on how
much CPU the iowait is taking up. The iowait share varies all the way from
10 to 60%. (The iowait CPU stays within 1% on the 2.4.18 kernel.) What is
even worse, sometimes the write() calls take anywhere between 1 and 2
seconds (yes, seconds!) to return, which degrades the performance of our
application server pretty badly.
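
For anyone trying to reproduce the iowait numbers, here is a small sketch of
how the iowait share can be derived from two samples of the aggregate "cpu"
line in /proc/stat. It assumes the running kernel exports an iowait counter
as the fifth value on that line, which not every kernel does; the struct and
function names are hypothetical.

```c
#include <stdio.h>

/* First five counters of the aggregate "cpu" line (assumed layout). */
struct cpu_sample {
    unsigned long long user, nice, system, idle, iowait;
};

/* Parse the aggregate "cpu ..." line (not the per-CPU "cpu0" lines).
 * Returns 0 on success, -1 if fewer than five counters are present. */
int parse_cpu_line(const char *line, struct cpu_sample *s)
{
    int n = sscanf(line, "cpu %llu %llu %llu %llu %llu",
                   &s->user, &s->nice, &s->system, &s->idle, &s->iowait);
    return n == 5 ? 0 : -1;
}

/* Percentage of the time between samples a and b spent in iowait. */
double iowait_percent(const struct cpu_sample *a, const struct cpu_sample *b)
{
    unsigned long long total_a =
        a->user + a->nice + a->system + a->idle + a->iowait;
    unsigned long long total_b =
        b->user + b->nice + b->system + b->idle + b->iowait;

    if (total_b <= total_a || b->iowait < a->iowait)
        return 0.0;
    return 100.0 * (double)(b->iowait - a->iowait)
                 / (double)(total_b - total_a);
}
```

Reading the first line of /proc/stat twice, a few seconds apart, and passing
both samples to iowait_percent() gives a figure comparable to what the
monitoring tools report.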

It happens on all kinds of storage devices, so it's definitely not a
hardware problem. (We tested on everything from a standard IDE disk all
the way up to an EMC CLARiiON storage system.)

Is this iowait something that was introduced in the 2.4.21 kernel core code
or in the file system driver code? It definitely happens with jfs and
reiserfs, and also with ext3, though not as badly. Is there a way to turn it
off or to make it take up less CPU?

Any insight or pointers regarding this problem would be greatly
appreciated. Thank you for your support.

- Suman Puthana

PS - Something else of interest here is that when the iowait share of the
CPU increases, iostat shows significantly higher throughput than what we are
actually trying to write to disk. (Sampled over 30 seconds, while we are
trying to write 7 MBps, iostat actually shows 11-12 MBps when iowait goes
up to 30-40%.) It's almost as if there are some internal copies going on.
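
To put a number on that discrepancy, the ratio of device-level throughput to
application-level throughput can be computed as below. The function name is
hypothetical. With the figures above (11-12 MBps from iostat against 7 MBps
written by the application) the ratio comes out at roughly 1.6 to 1.7.

```c
/* Ratio of device-level throughput (for example, as reported by iostat)
 * to the throughput the application itself is writing. Returns 0.0 if the
 * application rate is not positive. */
double write_amplification(double device_mbps, double app_mbps)
{
    return app_mbps > 0.0 ? device_mbps / app_mbps : 0.0;
}
```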


