RE: 2.6.3 userspace freeze

From: Anders K. Pedersen
Date: Fri Mar 12 2004 - 03:50:56 EST


Hello again,

Tonight I reproduced this issue with Linux 2.6.4 final; I've attached
the dmesg and .config for this kernel.

Output from vmstat 1 before the freeze:

   procs                    memory  swap        io    system      cpu
 r  b  w swpd   free   buff  cache si so   bi   bo   in   cs us sy id
 0  0  0    0 459408 241776 642496  0  0    0  208 1166  548  5  2 92
 0  0  0    0 459408 241780 642492  0  0    0   88 1241  540  9  8 84
 0  0  0    0 458320 241780 642492  0  0    0    0 1202  456  6  1 93
 0  0  0    0 458320 241784 642488  0  0    0   80 1161  440  1  1 98
 0  0  0    0 458384 241784 642488  0  0    0    0 1184  460  5  1 94
 0  0  0    0 458400 241784 642488  0  0    0   56 1189  437  4  1 95
 0  0  0    0 458400 241784 642488  0  0    0   28 1148  424  1  8 91
 0  0  0    0 459936 241784 642556  0  0    0  124 1229  520  7  1 92
 0  0  0    0 459936 241792 642616  0  0   64   60 1185  444  1  1 98
 0  0  0    0 458272 241792 642616  0  0    4    0 1237  501 11  3 86
 0  0  0    0 458272 241792 642616  0  0    0    0 1173  414  3  0 97
 1  0  0    0 460128 241792 642616  0  0    0    0 1162  390  1  7 92
 0  0  0    0 459936 241800 642608  0  0    0  272 1266  646 11  4 86
 0  0  0    0 459936 241808 642600  0  0    0   64 1194  497  1  1 98
 0  0  0    0 459936 241808 642600  0  0    0    0 1169  449  3  1 96
 0  0  0    0 459936 241808 642668  0  0   32    0 1134  374  1  1 99
 1  0  0    0 457952 241816 642660  0  0   20    0 1099  363  4  7 89
 0  0  0    0 457968 241820 642656  0  0    0   88 1118  371  1  2 97
 0  0  0    0 457968 241824 642720  0  0   16  144 1186  444  1  1 99
 0  0  0    0 457840 241824 642720  0  0    0   48 1148  385  0  0 99
 0  0  0    0 457648 241840 642704  0  0   16  116 1163  452  1  1 99
   procs                    memory  swap        io    system      cpu
 r  b  w swpd   free   buff  cache si so   bi   bo   in   cs us sy id
 1  0  0    0 455464 241840 642704  0  0   24    0 1122  369  4  3 94
 0  0  0    0 457640 241848 642696  0  0    0  136 1198  454  5  8 87
 2  0  0    0 457576 241848 642696  0  0    0   64 1218  482  5  1 94
 0  0  0    0 457576 241848 642696  0  0    0  128 1138  397  1  0 99
 0  0  0    0 457576 241848 642696  0  0    0    0 1132  413  1  0 99
 0  0  0    0 457576 241848 642696  0  0    0   92 1198  511  2  0 98
 1  0  0    0 457256 241852 642760  0  0   12  116 1177  539  5  8 88
 1  0  0    0 457768 241856 642756  0  0    0   64 1224  483  2  1 96
 0  0  0    0 457656 241856 642756  0  0    4   64 1162  456  2  2 96
 0  0  0    0 457656 241856 642756  0  0    0    0 1174  420  3  0 97
 0  0  0    0 457656 241856 642756  0  0    0  120 1358  588  4  1 96
 0  0  0    0 457592 241860 642752  0  0    0  200 1179  437  8  8 85
 0  0  0    0 457592 241860 642752  0  0    0   64 1184  474  1  0 98
 0  0  0    0 457656 241860 642752  0  0    0  212 1176  438  5  1 94
 1  0  0    0 457656 241868 642812  0  0   12    0 1123  391  4  1 96
 0  0  0    0 457464 241868 642812  0  0    0  128 1108  387  2  1 98
 1  0  0    0 453048 241872 642808  0  0    0   60 1160  537 40 10 50
 0  0  0    0 450664 241872 642808  0  0    0   60 1395  681 15  1 84
 0  0  0    0 450664 241872 642808  0  0    4    0 1414  616  6  1 93
 0  0  0    0 450600 241872 642808  0  0    0    0 1159  390  6  0 93
 4  0  0    0 449792 241876 642804  0  0    4   88 1151  405  2  2 96
   procs                    memory  swap        io    system      cpu
 r  b  w swpd   free   buff  cache si so   bi   bo   in   cs us sy id
 1  0  0    0 437864 241952 651908  0  0 8776  132 1385  754 44 21 35
 1  0  0    0 433376 241956 651904  0  0    4  964 1184  448 48  4 48
 1  0  0    0 428320 241960 651900  0  0    8    8 1092  358 47  6 46
 1  0  0    0 422432 241960 651900  0  0   12    0 1112  345 49  5 46
 1  0  0    0 415264 241960 651900  0  0    0  268 1090  363 48  9 44
 1  0  0    0 405536 241960 651900  0  0    8  128 1160  423 51 18 31
19  0 16    0 687024 241972 651956  0  0    0  568 1341  301 18 65 18
 2  0  0    0 709040 242668 654048  0  0 1952    0 1283 1829 66 34  0
 3  0  0    0 716656 242812 653292  0  0  132 1608 1136 4260 86 14  0
 2  0  0    0 714544 242864 653308  0  0   16    0 1059 5388 88 12  0
 2  0  0    0 708736 242884 657164  0  0 3256   16 1346 3042 80 20  0
 2  0  0    0 709056 243112 660812  0  0 4516  428 1766 2697 67 12 20
 2  0  0    0 702080 243860 661016  0  0  744    0 1220  597 84 16  0
 2  0  0    0 694272 244836 660992  0  0  976 1700 1310  852 86 13  1
 3  0  0    0 705176 245260 660976  0  0  424  144 1179  677 18 79  3
 3  0  0    0 699992 245436 660936  0  0  176    0 1119  347 11 89  0
 0  1  1    0 693464 245944 660972  0  0  480 1314 1180  615 12 65 23
 5  0  0    0 630272 246636 661028  0  0  688  648 1302  744 57 36  7
 0  1  0    0 569640 246884 661120  0  0  352    0 1654 1114 56 42  2
 0  1  0    0 555880 248504 661472  0  0 1880    0 1781 1418 28 16 56
 0  1  0    0 547304 250128 661548  0  0 1624    0 1543 1208 11 22 67
   procs                    memory  swap        io    system      cpu
 r  b  w swpd   free   buff  cache si so   bi   bo   in   cs us sy id
 0  1  0    0 545016 250924 661568  0  0  788 1004 1301  921  3  4 93
 1  1  0    0 541368 251856 661588  0  0  928 1780 1511 1095 25  7 67
SOFTDOG: Initiating system reboot.
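
For anyone who wants to scan a longer capture for the point where system
time takes over, here is a minimal sketch rather than a finished tool. The
function names (parse_vmstat, busy_samples) and the 50% threshold are my own
assumptions for illustration; nothing here comes from vmstat itself. It
regroups the numbers 16 per sample, so rows that a mailer has wrapped are
still read correctly.

```python
import re

# Column order as printed in the vmstat header above.
FIELDS = ["r", "b", "w", "swpd", "free", "buff", "cache",
          "si", "so", "bi", "bo", "in", "cs", "us", "sy", "id"]


def parse_vmstat(text):
    """Return one dict per vmstat sample, tolerating rows wrapped by a mailer."""
    numbers = []
    for line in text.splitlines():
        tokens = line.split()
        # Keep only purely numeric lines; this skips the repeated headers
        # and anything else mixed into the capture (e.g. the SOFTDOG message).
        if tokens and all(re.fullmatch(r"\d+", t) for t in tokens):
            numbers.extend(int(t) for t in tokens)
    # Drop a trailing partial sample, if the capture was cut off mid-row.
    usable = len(numbers) - len(numbers) % len(FIELDS)
    return [dict(zip(FIELDS, numbers[i:i + len(FIELDS)]))
            for i in range(0, usable, len(FIELDS))]


def busy_samples(samples, sy_threshold=50):
    """Indices of samples whose system CPU time is at or above the threshold.

    The default threshold is an arbitrary assumption, not a known limit.
    """
    return [i for i, s in enumerate(samples) if s["sy"] >= sy_threshold]


if __name__ == "__main__":
    # Two samples copied from the capture above, in their wrapped form.
    sample = """\
r b w swpd free buff cache si so bi bo in cs us
sy id
1 0 0 0 405536 241960 651900 0 0 8 128 1160 423 51
18 31
19 0 16 0 687024 241972 651956 0 0 0 568 1341 301 18
65 18
"""
    rows = parse_vmstat(sample)
    print(busy_samples(rows))
```

Run against the full capture, this should only point at the samples where
the sy column is already visibly high; it is a convenience for longer logs,
not a diagnosis.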

Last output from top before it froze:

04:02:21 up 2:11, 2 users, load average: 1.13, 0.36, 0.17
271 processes: 270 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 30.0% user 30.3% system 0.2% nice 14.1% iowait 24.3% idle
CPU1 states: 16.3% user 27.2% system 1.1% nice 49.1% iowait 7.0% idle
Mem: 2073128k av, 1523840k used, 549288k free, 0k shrd, 250020k buff
     860220k active, 513392k inactive
Swap: 8384880k av, 0k used, 8384880k free 661520k cached

  PID USER     PRI NI  SIZE  RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
                16  0  114M 106M 11320 S    59.8  5.2 0:58   1 httpd
23536 root      34 19  1664  720  1368 D N   6.0  0.0 0:00   1 updatedb
 1994 root      17  0  2416 1396  1936 R     2.7  0.0 3:44   0 top
23699 paradona  23  0  4660 3428  2280 S     2.7  0.1 0:00   0 topic.cgi
23683 root      15  0 30328  28M  1712 S     2.1  1.4 0:00   0 rotatelogspsoft
 1853 root      16  0  1420  472  1356 S     0.9  0.0 1:02   0 vmstat
 1552 root      16  0  373M  61M 44620 S     0.3  3.0 0:18   0 caspeng
 1615 root      16  0  2068 1064  1864 S     0.3  0.0 0:20   1 soagent
23640 root      25  0  3660  752  3388 S     0.3  0.0 0:00   0 rotatelogs
23656 root      25  0  3660  752  3388 S     0.3  0.0 0:00   1 rotatelogs
23684 root      15  0  2696 1416  1712 S     0.3  0.0 0:00   1 rotatelogspsoft
23685 root      15  0  2584 1300  1712 S     0.3  0.0 0:00   1 rotatelogspsoft
  939 root      16  0  6132 1824  5788 S     0.1  0.0 0:01   1 sshd
23622 root      25  0  3656  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23623 root      25  0  3656  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23624 root      25  0  3656  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23625 root      25  0  3656  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23626 root      25  0  3656  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23627 root      25  0  3656  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23628 root      25  0  3656  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23629 root      25  0  3656  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23630 root      25  0  3660  752  3388 S     0.1  0.0 0:00   1 rotatelogs
23631 root      25  0  3660  752  3388 S     0.1  0.0 0:00   0 rotatelogs
23632 root      25  0  3660  752  3388 S     0.1  0.0 0:00   1 rotatelogs
23633 root      25  0  3660  752  3388 S     0.1  0.0 0:00   0 rotatelogs
23634 root      25  0  3660  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23635 root      25  0  3660  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23636 root      25  0  3660  748  3388 S     0.1  0.0 0:00   0 rotatelogs
23637 root      25  0  3660  748  3388 S     0.1  0.0 0:00   1 rotatelogs
23638 root      25  0  3660  752  3388 S     0.1  0.0 0:00   0 rotatelogs
23639 root      25  0  3660  752  3388 S     0.1  0.0 0:00   1 rotatelogs
23641 root      25  0  3660  752  3388 S     0.1  0.0 0:00   1 rotatelogs
23642 root      25  0  3660  752  3388 S     0.1  0.0 0:00   0 rotatelogs
23643 root      25  0  3660  752  3388 S     0.1  0.0 0:00   1 rotatelogs

If there is anything I can do to help debug the cause of these deadlocks,
please let me know.

Regards,
Anders K. Pedersen

> -----Original Message-----
> From: Anders K. Pedersen
> Sent: Thursday, March 11, 2004 1:46 AM
> To: 'Jan Kara'
> Cc: Andrew Morton; linux-kernel@xxxxxxxxxxxxxxx
> Subject: RE: 2.6.3 userspace freeze

> > -----Original Message-----
> > From: Jan Kara [mailto:jack@xxxxxxx]
> > Sent: Wednesday, March 10, 2004 4:44 PM
> > To: Anders K. Pedersen
> > Cc: Andrew Morton; linux-kernel@xxxxxxxxxxxxxxx
> > Subject: Re: 2.6.3 userspace freeze
>
> > > I will try this tonight; just to make sure I understand you correctly:
> > > You just want me to turn off quotas on all file systems (currently they
> > > are in use on one of them), and it is not necessary to recompile the
> > > kernel without quota support?
> > Yes, just turning quotas off with quotaoff(8) should be enough to rule
> > out possible deadlocks caused by quotas.
>
> I just tried booting the 2.6.3 kernel without any quotas
> enabled in fstab, and it failed as before after only 17 minutes.

Attachment: 2.6.4-dmesg
Description: 2.6.4-dmesg

Attachment: 2.6.4-config
Description: 2.6.4-config