Re: Strange freeze on 2.6.22 (deadlock?)

From: Brice Figureau
Date: Mon Jan 07 2008 - 14:06:21 EST



On Mon, January 7, 2008 18:20, Randy Dunlap wrote:
> On Mon, 07 Jan 2008 16:48:54 +0100 Brice Figureau wrote:
>
>> Hi,
>>
>> I'm seeing a strange complete server freeze/lock-up on a dual-Xeon HT
>> amd64 server running the standard Debian 2.6.22 kernel (and before
>> that vanilla 2.6.19.x and 2.6.20.x, which exhibited the same issue).
>>
>> I'm only reporting it now because I was only able to capture a full
>> sysrq-t this morning.
>>
>> The symptoms are that every 5 to 7 days the server (which acts as an
>> MX along with a few low-traffic websites) locks up. The IPMI watchdog
>> is unable to reboot the server (it doesn't even trigger, since there
>> is no evidence in the esmlog), yet the machine is still pingable. I
>> can't ssh to it; I can enter my login and password on a serial
>> console, but no shell is ever started.
>>
>> Pressing sysrq-t produced the trace hosted here:
>> http://www.daysofwonder.com/bug/crash-server1.txt.gz
>>
>> It happened once while I was connected to the server through ssh, and
>> I could see the load climb well above 100. It was then impossible to
>> launch new processes from the command line (and I had to reboot
>> manually).
>> It also happened last week, when the server was stuck for about 6
>> hours. When I started investigating, it slowly came back to life
>> (with a 1-minute load average of more than 1500 and tons of cron
>> processes running in parallel).
>>
>> I'm not really familiar with kernel development, so I can't really
>> find the issue in the trace output linked above.
>> My guess is that for some reason there is a race/deadlock that
>> eventually prevents new processes from really starting (which in turn
>> produces the high load).
>>
>> What looks suspicious in the aforementioned trace:
>> *) lots of process stack traces end in __mod_timer+0xc3/0xd3,
>> which seems to be this line from kernel/timer.c:
>>
>> 415     timer->expires = expires;
>> 416     internal_add_timer(base, timer);
>> 417 --> spin_unlock_irqrestore(&base->lock, flags);
>> 418
>> 419     return ret;
>> 420 }
>>
>> *) lots of process stack traces end in __mutex_lock_slowpath and/or
>> zone_statistics
>
> There are also lots of processes in D state (usually waiting
> for I/O to complete). And jbd is in their stack traces.
>
> How are the ext3 filesystems mounted? I mean, with what data=xyz
> mode? data=journal (the heaviest-duty mode) has at least one
> known deadlock. If you are using data=journal, you could try
> switching to data=ordered...

Thanks for the answer.
I'm using whatever the default mount option is (which I believe is
data=ordered). The only other mount options I use are noatime,nodiratime.
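
For what it's worth, here is how the effective data mode can be
double-checked (the quoted message is what I'd expect to see, not a
captured log):

  dmesg | grep EXT3
  # expect something like: EXT3-fs: mounted filesystem with ordered data mode.
  grep ext3 /proc/mounts
  # if no data= option is listed, the compiled-in default (data=ordered) applies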

Note that a large share of the processes in D state are "waiting" in
__mutex_lock, called from generic_file_aio_write.
Another large share are stuck in balance_dirty_pages_ratelimited_nr. If
I understand that function correctly, it throttles writers once the
dirty-page thresholds are exceeded, so a stalled writeback would make
every writer pile up there.

It looks like there was some writeback congestion on the block device.
All the /proc/sys/vm/* files are at their defaults.
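
In case it helps, this is a quick way to record the relevant writeback
tunables (all at their stock values here):

  grep . /proc/sys/vm/dirty_*
  # prints dirty_background_ratio, dirty_expire_centisecs,
  # dirty_ratio and dirty_writeback_centisecs with their current values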

It is as if it were no longer possible to write to the block device at
all. Could a block-device write error (i.e. a hardware failure) be the
root cause?

Any other ideas?
What should I try the next time it freezes?
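
In the meantime I'll leave a lightweight logger running on the box, so
the next freeze at least leaves some data behind (a rough sketch; the
log paths are arbitrary):

  # one-second vmstat samples, to catch the onset of the pile-up
  nohup vmstat 1 >> /var/log/vmstat.log 2>&1 &

  # periodic snapshot of memory/writeback counters
  while true; do date; cat /proc/meminfo; sleep 10; done \
    >> /var/log/meminfo.log 2>&1 &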
--
Brice Figureau
