Re: Degrading disk read performance under 2.2.16

From: Corin Hartland-Swann (cdhs@commerce.uk.net)
Date: Fri Aug 11 2000 - 11:11:59 EST


Mark,

I'm cross-posting this to the kernel and RAID lists again, because there's
more information about the problem in here. I hope this is acceptable.

On Thu, 10 Aug 2000, Mark Hahn wrote:
> > Intel 810E Chipset Motherboard (CA810EAL), Pentium III-667, 32M RAM,
> > Maxtor DiamondMax Plus 40 40.9GB UDMA66 Disk, Model 54098U8
>
> a low-memory machine, by today's standards. with one of the fastest
> disks available. and, for that matter, a pretty fast CPU.

The RAM is deliberately lowered - it normally has 128M. I lowered it so
that I could benchmark with 256M file size, instead of the 1G or so it
would take to get decent results.

The idea behind the spec was to use really quite currrent components,
fast processor and disk to stress the kernel as much as possible and
reveal bottlenecks.

> > I used tiotest to benchmark, using a file size of 256MB, block size of 4K,
> > and with 1, 2, 4, 16, 32 threads. The performance starts to get hit as

I forgot to add that I ran each test five times so as to get consistent
results.

> does larger blocksizes change the picture at all? I'm wondering whether
> readahead is effecting things (shouldnt, but...)

mke2fs only allows block sizes of 1K, 2K or 4K, so I can't make the
blocksize any larger...

> > ==> 2.2.16 <==
> > Dir Size BlkSz Thr# Read (CPU%) Write (CPU%) Seeks (CPU%)
> > ----- ------ ------- ---- ------------- -------------- --------------
> > /mnt/ 256 4096 1 27.0948 10.5% 25.9272 22.5% 147.264 0.83%
> > /mnt/ 256 4096 2 20.5753 8.42% 25.9316 22.7% 143.113 0.56%
> > /mnt/ 256 4096 4 13.0397 6.09% 25.7723 22.7% 143.619 0.51%
> > /mnt/ 256 4096 16 9.38573 5.40% 25.6285 23.1% 147.815 0.51%
> > /mnt/ 256 4096 32 6.96358 5.90% 25.2889 22.9% 151.182 0.57%
>
> so when this is running, does "vmstat 1" indicate that the system's
> swapping? I'm guessing so, that it's trying to scavenge pages from
> live processes, ineffectually (since they're live), and thrashing.

After going to all the trouble of testing this out, I remembered that I
had not allocated any swap so that it couldn't influence the results at
all! d'oh!

> > I also tested 2.4.0-test5, and performance there sucks with one thread,
> > but doesn't degrade any more quickly than 2.2.15
> >
> > ==> 2.4.0-test5 <==
> >
> > Dir Size BlkSz Thr# Read (CPU%) Write (CPU%) Seeks (CPU%)
> > ----- ------ ------- ---- ------------- -------------- --------------
> > /mnt/ 256 4096 1 8.15398 49.6% 7.64104 71.2% 125.675 5.90%
> > /mnt/ 256 4096 2 8.01528 56.1% 7.63085 69.8% 123.076 5.59%
> > /mnt/ 256 4096 4 7.79720 61.6% 7.62060 68.1% 125.437 5.75%
> > /mnt/ 256 4096 16 7.64617 65.2% 7.60466 62.5% 129.431 5.83%
> > /mnt/ 256 4096 32 7.60580 67.6% 7.58702 51.7% 132.791 5.95%
>
> wow, that's horrible performance. I have some of these disks,
> and have never seen anything that bad. have you verified that
> the ide controller is being recognized properly, that it's
> using a decent udma33 or udma6 mode, etc?

I think I've located the problem... (with kernel 2.4, anyway) - it is
refusing to use DMA.

When I try hdparm -m -c -d1 -a, I get the following output:

/dev/hdc:
 setting using_dma to 1 (on)
 HDIO_SET_DMA failed: Operation not permitted
 multcount = 16 (on)
 I/O support = 1 (32-bit)
 using_dma = 0 (off)
 readahead = 8 (on)

The other values are the same as the ones I'm using under the other
kernels. My dmesg output looks like this:

ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller on PCI bus 00 dev f9
PIIX4: chipset revision 2
PIIX4: not 100% native mode: will probe irqs later
hda: IBM-DJNA-371350, ATA DISK drive
hdb: Maxtor 54098U8, ATA DISK drive
hdc: Maxtor 54098U8, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 26520480 sectors (13578 MB) w/1966KiB Cache, CHS=1650/255/63
hdb: 80041248 sectors (40981 MB) w/2048KiB Cache, CHS=4982/255/63
hdc: 80041248 sectors (40981 MB) w/2048KiB Cache, CHS=79406/16/63
Partition check:
 hda: hda1 hda2 hda3
 hdb: hdb1
 hdc: [PTBL] [4982/255/63] hdc1

Can anyone explain to me what the [PTBL] bit means? I've been wondering
this for about 4 years now, and still don't know :)

Anybody have any suggestions about how to get DMA working? Is it a problem
with the IDE controller?

> > I haven't tried any of this with SCSI disks, so the problem may well lie
> > in VFS or the VM subsystem (I know there are problems there as well). I
> > don't know anything about those, so I guess I should just leave it to the
> > professionals now :)
>
> hmm, I just booted my dev machine with mem=32m, rather than it's normal
> 128M. it gave the same bonnie results as with 128m.

The amount of RAM shouldn't affect disk results. If it does, then the test
file size you are using with the benchmarking tool isn't big enough. So,
basically, this is the right result.

Thanks,

Corin

/------------------------+-------------------------------------\
| Corin Hartland-Swann | Direct: +44 (0) 20 7544 4676 |
| Commerce Internet Ltd | Mobile: +44 (0) 79 5854 0027 |
| 22 Cavendish Buildings | Tel: +44 (0) 20 7491 2000 |
| Gilbert Street | Fax: +44 (0) 20 7491 2010 |
| Mayfair | Web: http://www.commerce.uk.net/ |
| London W1K 5HJ | E-Mail: cdhs@commerce.uk.net |
\------------------------+-------------------------------------/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Aug 15 2000 - 21:00:24 EST