There's something rotten in 1.3.x

Leonard N. Zubkoff (lnz@dandelion.com)
Sat, 6 Apr 1996 15:49:05 -0800


I happened to be doing some testing and noticed a serious performance anomaly
between Linux 1.3.84 and 1.2.13. Specifically, I have a test procedure which
executes the following 4 steps:

Step 1: Untar and patch an XFree86 3.1.1 source tree from the raw X11R6
xc-[123].tar.gz and contrib-[1234].tar.gz, and XFree86 patches.

Step 2: Compute MD5 checksums for the entire tree that was created writing
to a log file.

Step 3: Diff the log file from Step 2 with a master log file of the
correct MD5 checksums.

Step 4: Delete the tree created in Step 1.

In the past I've found this particular test procedure to be useful in verifying
that the disk and memory subsystems are operating correctly. Even single bit
errors in disk or memory transfers of gzip'ed tar files will always show up
under this procedure. I'm using a EXT2FS with 1KB block size.

In my present copy of 1.2.13, I get the following timings for these four steps
followed by the time for the whole procedure:

Step 1: 101.320u 78.930s 5:39.94 53.0% 0+0k 0+0io 1414pf+0w
Step 2: 52.690u 37.230s 5:10.86 28.9% 0+0k 0+0io 21pf+0w
Step 3: 0.320u 0.680s 0:03.92 25.5% 0+0k 0+0io 42pf+0w
Step 4: 0.580u 13.500s 2:50.45 8.2% 0+0k 0+0io 22pf+0w
Total: 154.920u 130.390s 13:45.25 34.5% 0+0k 0+0io 1522pf+0w

But in 1.3.84, we have:

Step 1: 102.250u 71.450s 5:34.31 51.9% 0+0k 0+0io 26668pf+0w
Step 2: 53.610u 429.690s 11:01.45 73.0% 0+0k 0+0io 102567pf+0w
Step 3: 0.320u 0.530s 0:03.29 25.8% 0+0k 0+0io 2054pf+0w
Step 4: 0.510u 10.700s 2:25.32 7.7% 0+0k 0+0io 50pf+0w
Total: 156.710u 512.370s 19:04.47 58.4% 0+0k 0+0io 131440pf+0w

And in 1.3.59:

Step 1: 102.030u 72.090s 5:52.25 49.4% 0+0k 0+0io 26653pf+0w
Step 2: 54.340u 441.460s 12:06.85 68.2% 0+0k 0+0io 102570pf+0w
Step 3: 0.320u 0.600s 0:04.26 21.5% 0+0k 0+0io 2047pf+0w
Step 4: 0.660u 11.090s 2:24.93 8.1% 0+0k 0+0io 48pf+0w
Total: 157.360u 525.280s 20:28.39 55.5% 0+0k 0+0io 131419pf+0w

The interesting difference here is in Step 2, where there's an enormous amount
of system time in 1.3.x compared to 1.2.13. I've never played with kernel
profiling before, so this seemed like as good a time as any. Alas, for some
reason readprofile-1.1 didn't work out of the box so I had to fix it first. In
any event, here's the kernel profiling data for Step 2 in 1.3.84. It seems
pretty obvious that get_empty_inode has some serious problems...

Leonard

1 _do_signal 0.0011
73 _system_call 0.4562
65 ret_from_sys_call 0.4514
2 error_code 0.0208
1 _page_fault 0.0027
2 _do_page_fault 0.0006
146 _schedule 0.2340
73 _wake_up 0.3318
18 _wake_up_interruptible 0.0833
25 _interruptible_sleep_on 0.1330
1 _sys_syslog 0.0019
3 _printk 0.0099
1 _in_group_p 0.0147
1 _sys_sigaction 0.0025
1 __getitimer 0.0052
1 _clear_page_tables 0.0033
1 _new_page_tables 0.0042
3 _copy_page_range 0.0028
23 _do_wp_page 0.0389
82 _verify_area 0.1916
18 _do_no_page 0.0210
1 _do_mmap 0.0013
3 _do_munmap 0.0022
1 _insert_vm_struct 0.0013
1 _merge_segments 0.0008
50 _truncate_inode_pages 0.0992
19 _shrink_mmap 0.0371
56 _update_vm_cache 0.1818
51 _try_to_read_ahead 0.1678
7 ___wait_on_page 0.0365
1184 _generic_file_read 1.1888
4 _filemap_nopage 0.0064
2 _kmalloc 0.0033
1 _swap_out 0.0036
1 _try_to_free_page 0.0052
1 _kswapd 0.0033
118 _free_pages 0.2565
84 ___get_free_pages 0.1615
1 _sys_swapon 0.0007
2 _sys_utime 0.0089
7 _sys_chmod 0.0350
12 _do_open 0.0333
2 _sys_open 0.0227
5 _close_fp 0.0379
4 _sys_close 0.0500
29 _sys_read 0.2014
27 _sys_write 0.0978
22 _insert_inode_hash 0.3929
1 _grow_inodes 0.0064
168 _clear_inode 0.6462
4 _write_inode 0.0400
7 _inode_change_ok 0.0282
1 _inode_setattr 0.0074
3 _notify_change 0.0268
12 _sync_inodes 0.0566
141 _iput 0.3264
38523 _get_empty_inode 117.4482
404 ___iget 0.8279
124 _get_empty_filp 0.5536
3 ___wait_on_buffer 0.0117
459 _get_hash_table 2.0491
123 _refill_freelist 0.0907
432 _getblk 0.3830
53 _set_writetime 0.5521
296 _refile_buffer 0.2701
41 ___brelse 0.6029
1 ___bforget 0.0053
71 _bread 0.5221
31 _put_unused_buffer_head 0.4559
10 _get_more_buffer_heads 0.0926
30 _get_unused_buffer_head 0.2778
12 _create_buffers 0.1000
163 _brw_page 0.2411
70 _mark_buffer_uptodate 0.2778
23 _generic_readpage 0.1085
1 _grow_buffers 0.0041
2 _try_to_free_buffer 0.0029
2 _maybe_shrink_lav_buffers 0.0106
63 _sync_old_buffers 0.2128
2 _bdflush 0.0060
18 _cp_new_stat 0.0634
4 _sys_newstat 0.0455
6 _sys_newlstat 0.0682
1 _sys_newfstat 0.0109
1 _flush_old_exec 0.0024
512 _pipe_read 1.2427
473 _pipe_write 0.9385
106 _getname 0.3193
1 _putname 0.0625
36 _permission 0.2432
2 _put_write_access 0.1667
89 _lookup 0.3648
37 _follow_link 0.3304
90 _dir_namei 0.3041
12 __namei 0.0638
1 _lnamei 0.0139
1 _namei 0.0132
24 _open_namei 0.0247
1 _do_mkdir 0.0030
1 _do_unlink 0.0039
3 _sys_fcntl 0.0037
19 _fillonedir 0.1532
7 _old_readdir 0.0449
2 _locks_remove_locks 0.0250
509 _dcache_lookup 1.4298
266 _dcache_add 0.6584
1 _read_profile 0.0041
76 _ext2_permission 0.2794
10 _get_group_desc 0.0500
2 _load__block_bitmap 0.0034
14 _ext2_free_blocks 0.0150
202 _ext2_new_block 0.0889
2 _ext2_check_blocks_bitmap 0.0031
5 _ext2_count_free 0.0625
494 _ext2_check_dir_entry 2.0932
48 _ext2_readdir 0.0297
615 _ext2_file_write 0.5531
2 _ext2_release_file 0.0833
7 _sync_block 0.0324
1 _sync_iblock 0.0078
5 _sync_indirect 0.0347
2 _sync_dindirect 0.0139
54 _get_group_desc 0.2700
1 _read_inode_bitmap 0.0069
2 _load_inode_bitmap 0.0051
12 _inc_inode_version 0.0280
48 _ext2_new_inode 0.0301
16 _ext2_put_inode 0.2353
31 _ext2_discard_prealloc 0.3523
378 _ext2_alloc_block 0.8514
45 _ext2_bmap 0.0765
102 _inode_getblk 0.2525
33 _block_getblk 0.0543
122 _ext2_getblk 0.2194
2 _ext2_bread 0.0068
204 _ext2_read_inode 0.2615
89 _ext2_update_inode 0.1648
6 _ext2_write_inode 0.0882
242 _ext2_match 1.5125
343 _ext2_find_entry 0.4738
98 _ext2_lookup 0.2988
64 _ext2_add_entry 0.0856
1 _ext2_delete_entry 0.0089
4 _ext2_create 0.0114
1 _do_ext2_rename 0.0007
1 _trunc_indirect 0.0014
1 _trunc_dindirect 0.0021
1 _ext2_truncate 0.0029
16 _delay_10ms 0.1739
228 _unplug_device 3.8000
5 ___get_request_wait 0.0147
3 _is_read_only 0.0577
277 _add_request 0.3826
436 _make_request 0.4022
54 _ll_rw_block 0.1031
1 _do_blank_screen 0.0034
1 _get_scrmem 0.0071
1 _read_chan 0.0006
1 _id_read_eeprom 0.0054
278 _scsi_do_cmd 0.3075
1 _scsi_init_malloc 0.0089
416 _BusLogic_Command 0.4031
2 _BusLogic_HardResetHostAdapter 0.0051
400 _BusLogic_InquireTargetDevices 0.4237