Re: Processes spinning forever, apparently in lock_timer_base()?

From: richard kennedy
Date: Sat Sep 22 2007 - 08:08:55 EST


On Fri, 2007-09-21 at 03:33 -0700, Andrew Morton wrote:
> On Fri, 21 Sep 2007 11:25:41 +0100 richard kennedy <richard@xxxxxxxxxxxxxxx> wrote:
>
> > > That's all a bit crappy if the wrong races happen and some other task is
> > > somehow exceeding the dirty limits each time this task polls them. Seems
> > > unlikely that such a condition would persist forever.
> > >
> > > So the question is, why do we have large amounts of dirty pages for one
> > > disk which appear to be sitting there not getting written?
> >
> > The lockup I'm seeing intermittently occurs when I have 2+ tasks copying
> > large files (1Gb+) on sda & a small read-mainly mysql db app running on
> > sdb. The lockup seems to happen just after the copies finish -- there
> > are lots of dirty pages but nothing left to write them until kupdate
> > gets round to it.
>
> Then what happens? The system recovers?

When my system is locked up, I get this from sysrq-w (2.6.23-rc7):

SysRq : Show Blocked State
auditd D ffff8100b422fd28 0 1999 1
ffff8100b422fd78 0000000000000086 0000000000000000 ffff8100b4103020
0000000000000286 ffff8100b4103020 ffff8100bf8af020 ffff8100b41032b8
0000000100000000 0000000000000001 ffffffffffffffff 0000000000000003
Call Trace:
[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
[<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
[<ffffffff810a79f7>] __writeback_single_inode+0x1f4/0x332
[<ffffffff8108c0b0>] vfs_statfs_native+0x29/0x34
[<ffffffff810a8362>] sync_inode+0x24/0x36
[<ffffffff8803c350>] :ext3:ext3_sync_file+0xb4/0xc8
[<ffffffff81240730>] mutex_lock+0x1e/0x2f
[<ffffffff810aa7fa>] do_fsync+0x52/0xa4
[<ffffffff810aa86f>] __do_fsync+0x23/0x36
[<ffffffff8100bc8e>] system_call+0x7e/0x83

syslogd D ffff8100b3f67d28 0 2022 1
ffff8100b3f67d78 0000000000000086 0000000000000000 0000000100000000
0000000000000003 ffff810037c66810 ffff8100bf8af020 ffff810037c66aa8
0000000100000000 0000000000000001 ffffffffffffffff 0000000000000003
Call Trace:
[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
[<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
[<ffffffff810a79f7>] __writeback_single_inode+0x1f4/0x332
[<ffffffff810a8362>] sync_inode+0x24/0x36
[<ffffffff8803c350>] :ext3:ext3_sync_file+0xb4/0xc8
[<ffffffff81240730>] mutex_lock+0x1e/0x2f
[<ffffffff810aa7fa>] do_fsync+0x52/0xa4
[<ffffffff810aa86f>] __do_fsync+0x23/0x36
[<ffffffff8100be0c>] tracesys+0xdc/0xe1
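
Both traces show a task blocked inside fsync(): do_fsync calls ext3_sync_file, which ends up in jbd's log_wait_commit waiting for a journal commit. A rough way to watch for the same stall from userspace is to time fsync() on a scratch file on the affected filesystem while the copies run. This is only a sketch: the helper name timed_fsync and the scratch-file prefix are made up for illustration, and it measures latency rather than proving where the kernel is waiting.

```python
import os
import tempfile
import time


def timed_fsync(directory, payload=b"probe\n"):
    """Write payload to a scratch file in `directory`, fsync it, and
    return the number of seconds spent inside fsync().

    `directory` should live on the filesystem under test; the scratch
    file is removed again before returning.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        os.write(fd, payload)
        start = time.monotonic()
        # This is the userspace call that enters do_fsync in the kernel;
        # on ext3 it may have to wait for a journal commit.
        os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling timed_fsync() in a loop against a directory on the busy disk and logging the result should show whether fsync latency climbs when the lockup starts, assuming the probe itself does not get stuck in D state like auditd and syslogd.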

