Performance gap between 2.2.14 and 2.4.0-test4 kernels

From: Gianluca Cecchi (gianluca.cecchi@teras.it)
Date: Mon Jul 24 2000 - 00:12:14 EST


The system:

MB: Supermicro P6SBU (Adaptec 7890 on board)
CPU: 1 x Pentium III 500 MHz
Mem: 256 MB

1 x 9.1 GB IBM DNES-309170W disk on the fast/SE channel
4 x 18.2 GB IBM DNES-318350W disks on the Ultra2 channel
The 18.2 GB disks form a software RAID0 array. The /etc/raidtab file follows:

raiddev /dev/md0
          raid-level 0
          nr-raid-disks 4
          persistent-superblock 1
          chunk-size 128
          device /dev/sdb1
          raid-disk 0
          device /dev/sdc1
          raid-disk 1
          device /dev/sdd1
          raid-disk 2
          device /dev/sde1
          raid-disk 3
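
As a rough illustration of what this raidtab implies, the sketch below maps a logical offset on /dev/md0 to a member disk. It assumes the usual single-zone RAID0 layout (chunks handed out round-robin across equally sized members) and that chunk-size is given in KB; the function name `locate` is made up for this example and is not part of any md tool.

```python
# Sketch of single-zone RAID0 striping for the raidtab above.
# Assumptions: chunk-size is in KB and chunks are distributed round-robin.
CHUNK_KB = 128   # chunk-size from the raidtab
DEVICES = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1", "/dev/sde1"]  # raid-disk 0..3


def locate(offset_kb):
    """Map a logical offset on /dev/md0 (KB) to (device, offset on device in KB)."""
    chunk_index, within_chunk = divmod(offset_kb, CHUNK_KB)
    stripe, disk = divmod(chunk_index, len(DEVICES))
    return DEVICES[disk], stripe * CHUNK_KB + within_chunk


if __name__ == "__main__":
    for offset in (0, 128, 384, 512, 1000):
        print(offset, locate(offset))
```

Under these assumptions a 1 MB sequential request (1024 KB) covers eight chunks, that is two chunks on each of the four disks, so a large sequential read should keep all four drives busy.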

Output of dmesg related to the SCSI configuration (from the 2.4.0-test4 boot):

md.c: sizeof(mdp_super_t) = 4096
(scsi0) <Adaptec AIC-7890/1 Ultra2 SCSI host adapter> found at PCI 0/14/0
(scsi0) Wide Channel, SCSI ID=7, 32/255 SCBs
(scsi0) Downloading sequencer code... 392 instructions downloaded
scsi0 : Adaptec AHA274x/284x/294x (EISA/VLB/PCI-Fast SCSI) 5.2.1/5.2.0
       <Adaptec AIC-7890/1 Ultra2 SCSI host adapter>
scsi : 1 host.
(scsi0:0:5:0) Synchronous at 10.0 Mbyte/sec, offset 15.
  Vendor: SONY Model: SDT-9000 Rev: 0400
  Type: Sequential-Access ANSI SCSI revision: 02
(scsi0:0:6:0) Synchronous at 40.0 Mbyte/sec, offset 31.
  Vendor: IBM Model: DNES-309170W Rev: SA30
  Type: Direct-Access ANSI SCSI revision: 03
Detected scsi disk sda at scsi0, channel 0, id 6, lun 0
(scsi0:0:8:0) Synchronous at 80.0 Mbyte/sec, offset 31.
  Vendor: IBM Model: DNES-318350W Rev: SA30
  Type: Direct-Access ANSI SCSI revision: 03
Detected scsi disk sdb at scsi0, channel 0, id 8, lun 0
(scsi0:0:9:0) Synchronous at 80.0 Mbyte/sec, offset 31.
  Vendor: IBM Model: DNES-318350W Rev: SA30
  Type: Direct-Access ANSI SCSI revision: 03
Detected scsi disk sdc at scsi0, channel 0, id 9, lun 0
(scsi0:0:10:0) Synchronous at 80.0 Mbyte/sec, offset 31.
  Vendor: IBM Model: DNES-318350W Rev: SA30
  Type: Direct-Access ANSI SCSI revision: 03
Detected scsi disk sdd at scsi0, channel 0, id 10, lun 0
(scsi0:0:12:0) Synchronous at 80.0 Mbyte/sec, offset 31.
  Vendor: IBM Model: DNES-318350W Rev: SA30
  Type: Direct-Access ANSI SCSI revision: 03
Detected scsi disk sde at scsi0, channel 0, id 12, lun 0
scsi : detected 5 SCSI disks total.
SCSI device sda: hdwr sector= 512 bytes. Sectors= 17916240 [8748 MB] [8.7 GB]
Partition check:
 sda: sda1 sda2 < sda5 sda6 sda7 >
SCSI device sdb: hdwr sector= 512 bytes. Sectors= 35843670 [17501 MB] [17.5 GB]
 sdb: sdb1
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 35843670 [17501 MB] [17.5 GB]
 sdc: sdc1
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 35843670 [17501 MB] [17.5 GB]
 sdd: sdd1
SCSI device sde: hdwr sector= 512 bytes. Sectors= 35843670 [17501 MB] [17.5 GB]
 sde: sde1

[snipped]

(read) sdb1's sb offset: 17920384 [events: 00000085]
(read) sdc1's sb offset: 17920384 [events: 00000085]
(read) sdd1's sb offset: 17920384 [events: 00000085]
(read) sde1's sb offset: 17920384 [events: 00000085]
autorun ...
considering sde1 ...
  adding sde1 ...
  adding sdd1 ...
  adding sdc1 ...
  adding sdb1 ...
created md0
bind<sdb1,1>
bind<sdc1,2>
bind<sdd1,3>
bind<sde1,4>
running: <sde1><sdd1><sdc1><sdb1>
now!
sde1's event counter: 00000085
sdd1's event counter: 00000085
sdc1's event counter: 00000085
sdb1's event counter: 00000085
raid0 personality registered
md0: max total readahead window set to 2048k
md0: 4 data-disks, max readahead per data-disk: 512k
raid0: looking at sdb1
raid0: comparing sdb1(17920384) with sdb1(17920384)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sdc1
raid0: comparing sdc1(17920384) with sdb1(17920384)
raid0: EQUAL
raid0: looking at sdd1
raid0: comparing sdd1(17920384) with sdb1(17920384)
raid0: EQUAL
raid0: looking at sde1
raid0: comparing sde1(17920384) with sdb1(17920384)
raid0: EQUAL
raid0: FINAL 1 zones
zone 0
 checking sdb1 ... contained as device 0
  (17920384) is smallest!.
 checking sdc1 ... contained as device 1
 checking sdd1 ... contained as device 2
 checking sde1 ... contained as device 3
 zone->nb_dev: 4, size: 71681536
current zone offset: 17920384
done.
raid0 : md_size is 71681536 blocks.
raid0 : conf->smallest->size is 71681536 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md0 RAID superblock on device
sde1 [events: 00000086](write) sde1's sb offset: 17920384
sdd1 [events: 00000086](write) sdd1's sb offset: 17920384
sdc1 [events: 00000086](write) sdc1's sb offset: 17920384
sdb1 [events: 00000086](write) sdb1's sb offset: 17920384
.
... autorun DONE.
Detected scsi tape st0 at scsi0, channel 0, id 5, lun 0
st: bufsize 32768, wrt 30720, max init. buffers 4, s/g segs 16.

Below is the output of bonnie++ version 1.00, compiled on the 2.2.14 kernel (Red Hat 6.2):

kernel 2.2.14 no tagged
Version 1.00        ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine          MB K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Unknown        1000  8400  97 56908  70 21380  48  8475  96 58199  44   nan -2147483648
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 30   175  96   627  99  6451  99   182  99   806  99   722  91
Unknown,1000,8400,97,56908,70,21380,48,8475,96,58199,44, nan,-2147483648,30,175,96,627,99,6451,99,182,99,806,99,722,91

kernel 2.4.0-test4 no tagged
Version 1.00        ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine          MB K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Unknown        1000  8249  97 51642  37 10498  18  6190  72 17248  19   nan -2147483648
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 30   174  99 +++++  93  9417  93   180  99 +++++ 104  1282  98
Unknown,1000,8249,97,51642,37,10498,18,6190,72,17248,19, nan,-2147483648,30,174,99,+++++,93,9417,93,180,99,+++++,104,1282,98
 
What do you make of these results? In particular, why is there such a big difference in
sequential rewrite output (21380 K/sec at 48% CPU in 2.2 vs 10498 K/sec at 18% CPU in 2.4)
and in sequential block input (58199 K/sec at 44% CPU in 2.2 vs 17248 K/sec at 19% CPU in 2.4)?
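
To put numbers on the gap, here is a minimal sketch that takes the first twelve fields of the two bonnie++ CSV lines in this message and computes the 2.4/2.2 throughput ratio per test. The column labels (`char_out`, `block_in`, and so on) are names invented for this example; the field order is assumed to match the table headers above, i.e. machine, size, then (K/sec, %CP) pairs.

```python
# First twelve CSV fields of each bonnie++ run, copied from this message.
RUN_22 = "Unknown,1000,8400,97,56908,70,21380,48,8475,96,58199,44"
RUN_24 = "Unknown,1000,8249,97,51642,37,10498,18,6190,72,17248,19"

# Hypothetical labels for the five throughput columns, in table order.
COLUMNS = ["char_out", "block_out", "rewrite", "char_in", "block_in"]


def throughputs(csv_line):
    """Return {label: K/sec} for the five sequential tests of one run."""
    fields = csv_line.split(",")
    # fields[0] is the machine name, fields[1] the file size in MB;
    # after that come (K/sec, %CP) pairs, so K/sec sits at index 2, 4, 6, ...
    return {name: int(fields[2 + 2 * i]) for i, name in enumerate(COLUMNS)}


def ratios(old_line, new_line):
    """Return {label: new/old throughput ratio}."""
    old, new = throughputs(old_line), throughputs(new_line)
    return {name: new[name] / old[name] for name in COLUMNS}


if __name__ == "__main__":
    for name, ratio in ratios(RUN_22, RUN_24).items():
        print(f"{name:10s} 2.4/2.2 = {ratio:.2f}")
```

With these figures, block output is within about 10% on both kernels, while rewrite drops to roughly half and block input to roughly 30% of the 2.2.14 value.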

When copying big files with dd or cp, 2.4 delivers roughly one third of the 2.2.14 performance
(perhaps because of the poor sequential input performance?). For example, a 512 MB dd with a
1 MB block size takes 62 seconds on 2.4 versus 23 seconds on 2.2.14. The CPU load is 45% in 2.2
versus 18% in 2.4. Judging from vmstat, the kswapd overload problem does not seem to be much of
a factor here, but the performance gap remains.
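
For reference, the dd timings above translate into throughput as follows. This is only the arithmetic on the two elapsed times quoted in this message; the helper names are made up for the example.

```python
SIZE_MB = 512  # size of the dd transfer quoted above

# Elapsed seconds per kernel, as quoted above.
SECONDS = {"2.2.14": 23, "2.4.0-test4": 62}


def mb_per_sec(kernel):
    """Average dd throughput in MB/s for the given kernel."""
    return SIZE_MB / SECONDS[kernel]


def relative_throughput():
    """Throughput of 2.4.0-test4 as a fraction of 2.2.14."""
    return mb_per_sec("2.4.0-test4") / mb_per_sec("2.2.14")


if __name__ == "__main__":
    for kernel in SECONDS:
        print(f"{kernel}: {mb_per_sec(kernel):.1f} MB/s")
    print(f"2.4 / 2.2 = {relative_throughput():.2f}")
```

That works out to about 22.3 MB/s on 2.2.14 against about 8.3 MB/s on 2.4.0-test4, a ratio of about 0.37, which is in the same range as the bonnie++ block input ratio (17248 / 58199, about 0.30).
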
Is this the result of a deliberate design change toward multi-user/multi-processor environments,
or is it simply a performance regression?
Thanks in advance for any clarification.
Gianluca

PS: Let me know if I can help by testing anything on my hardware.
