Re: writeout stalls in current -git

From: Torsten Kaiser
Date: Sun Nov 04 2007 - 06:19:34 EST


On 11/2/07, David Chinner <dgc@xxxxxxx> wrote:
> That's stalled waiting on the inode cluster buffer lock. That implies
> that the inode lcuser is already being written out and the inode has
> been redirtied during writeout.
>
> Does the kernel you are testing have the "flush inodes in ascending
> inode number order" patches applied? If so, can you remove that
> patch and see if the problem goes away?

I can now confirm, that I see this also with the current mainline-git-version
I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c
plus the fix for the sg changes in ieee1394.
Bisecting would be troublesome, as the sg changes prevent mainline to
boot with my normal config / kill my network.

treogen ~ # vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
-> starting emerge
1 0 0 3627072 332 157724 0 0 97 13 41 189 2 2 94 2
0 0 0 3607240 332 163736 0 0 599 10 332 951 2 1 93 4
0 0 0 3601920 332 167592 0 0 380 2 218 870 1 1 98 0
0 0 0 3596356 332 171648 0 0 404 21 182 818 0 0 99 0
0 0 0 3579328 332 180436 0 0 878 12 147 912 1 1 97 2
0 0 0 3575376 332 182776 0 0 236 4 244 953 1 1 95 3
2 1 0 3571792 332 185084 0 0 232 7 256 1003 2 1 95 2
0 0 0 3564844 332 187364 0 0 228 605 246 1167 2 1 93 4
0 0 0 3562128 332 189784 0 0 230 4 527 1238 2 1 93 4
0 1 0 3558764 332 191964 0 0 216 24 438 1059 1 1 93 6
0 0 0 3555120 332 193868 0 0 199 36 406 959 0 0 92 8
0 0 0 3552008 332 195928 0 0 197 11 458 1023 1 1 90 8
0 0 0 3548728 332 197660 0 0 183 7 496 1086 1 1 90 8
0 0 0 3545560 332 199372 0 0 170 8 483 1017 1 1 90 9
0 1 0 3542124 332 201256 0 0 190 1 544 1137 1 1 88 10
1 0 0 3536924 332 203296 0 0 195 7 637 1209 2 1 89 8
1 1 0 3485096 332 249184 0 0 101 16 10372 4537 13 3 76 8
2 0 0 3442004 332 279728 0 0 1086 40 219 1349 7 3 87 4
-> emerge is done reading its package database
1 0 0 3254796 332 448636 0 0 0 27 128 8360 24 6 70 0
2 0 0 3143304 332 554016 0 0 47 33 213 4480 16 11 72 1
-> kernel unpacked
1 0 0 3125700 332 560416 0 0 1 20 122 1675 24 1 75 0
1 0 0 3117356 332 567968 0 0 0 674 157 2975 24 2 73 1
2 0 0 3111636 332 573736 0 0 0 1143 151 1924 23 1 75 1
2 0 0 3102836 332 581332 0 0 0 890 153 1330 24 1 75 0
1 0 0 3097236 332 587360 0 0 0 656 194 1593 24 1 74 0
1 0 0 3086824 332 595480 0 0 0 812 235 2657 25 1 74 0
-> tar.bz2 created, installing starts now
0 0 0 3091612 332 601024 0 0 82 708 499 2397 17 4 78 1
0 0 0 3086088 332 602180 0 0 69 2459 769 2237 3 4 88 6
0 0 0 3085916 332 602236 0 0 2 1752 693 949 1 2 96 1
0 0 0 3084544 332 603564 0 0 66 4057 1176 2850 3 6 91 0
0 0 0 3078780 332 605572 0 0 98 3194 1169 3288 5 6 89 0
0 0 0 3077940 332 605924 0 0 17 1139 823 1547 1 2 97 0
0 0 0 3078268 332 605924 0 0 0 888 807 1329 0 1 99 0
-> first short stall
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 3077040 332 605924 0 0 0 1950 785 1495 0 2 89 8
0 0 0 3076588 332 605896 0 0 2 3807 925 2046 1 4 95 0
0 0 0 3076900 332 606052 0 0 11 2564 768 1471 1 3 95 1
0 0 0 3071584 332 607928 0 0 87 2499 1108 3433 4 6 90 0
-> second longer stall
(emerge was not able to complete a single filemove until the 'resume' line)
0 0 0 3071592 332 607928 0 0 0 693 692 1289 0 0 99 0
0 0 0 3072584 332 607928 0 0 0 792 731 1507 0 1 99 0
0 0 0 3072840 332 607928 0 0 0 806 707 1521 0 1 99 0
0 0 0 3072724 332 607928 0 0 0 782 695 1372 0 0 99 0
0 0 0 3072972 332 607928 0 0 0 677 612 1301 0 0 99 0
0 0 0 3072772 332 607928 0 0 0 738 681 1352 1 1 99 0
0 0 0 3073020 332 607928 0 0 0 785 708 1328 0 1 99 0
0 0 0 3072896 332 607928 0 0 0 833 722 1383 0 0 99 0
-> emerge resumed
0 0 0 3069476 332 607972 0 0 2 4885 812 2062 1 4 90 5
1 0 0 3069648 332 608068 0 0 4 4658 833 2158 1 4 93 2
0 0 0 3064972 332 610364 0 0 106 2494 1095 3620 5 7 88 0
0 0 0 3057536 332 612444 0 0 86 2023 1012 3440 4 6 90 0
1 0 0 3054572 332 612368 0 0 102 1526 1024 2277 6 5 87 2
-> emerge finished, but still >100Mb of dirty data accoring to /proc/meminfo
0 0 0 3048548 332 615764 0 0 337 659 796 1000 3 1 96 0
0 0 0 3092100 332 615860 0 0 15 616 606 1040 1 0 99 0
0 0 0 3092148 332 615860 0 0 0 641 622 1085 0 0 99 0
0 0 0 3092528 332 615860 0 0 0 766 654 1055 1 1 99 0
-> slow writeout until here, might be fixed with Peters patch to scale
the background threshold
2 0 0 3090828 332 615860 0 0 0 1804 707 1215 0 2 98 0
0 0 0 3091056 332 615864 0 0 0 3877 831 2047 1 4 94 1
3 0 0 3090780 332 615864 0 0 0 2048 784 1154 1 2 97 1
0 0 0 3091096 332 615864 0 0 0 2690 751 1538 0 3 96 1
0 1 0 3091056 332 615864 0 0 0 2018 748 866 0 2 95 2
2 0 0 3092960 332 615864 0 0 0 2076 719 1118 0 2 97 0
-> writeout "done", /proc/meminfo showed 0kb of dirty data remaining
0 0 0 3093072 332 615864 0 0 0 645 646 1104 0 0 99 0
0 0 0 3093532 332 615864 0 0 0 726 658 1223 0 1 99 0
0 0 0 3093540 332 615864 0 0 0 801 699 1314 0 1 99 0
0 0 0 3093580 332 615864 0 0 0 783 738 1350 0 1 99 0
0 0 0 3093284 332 615920 0 0 6 746 655 1381 1 1 98 0
0 0 0 3092872 332 615920 0 0 0 862 703 1391 1 1 98 0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 3093224 332 615920 0 0 0 799 676 1394 0 0 99 0
0 0 0 3093304 332 615920 0 0 0 835 672 1514 1 1 98 0
0 0 0 3093476 332 615920 0 0 0 784 641 1404 1 1 98 0
0 0 0 3093264 332 615920 0 0 0 722 626 1483 1 1 99 0
0 0 0 3093476 332 615920 0 0 0 7 328 350 0 0 99 0
0 0 0 3093628 332 615920 0 0 0 11 332 407 0 0 99 0
-> disks finally go idle

Torsten

.config for 2.6.24-rc1+git attached

Attachment: config.gz
Description: GNU Zip compressed data