Re: server migration

From: Lawrence Walton
Date: Sat Mar 06 2004 - 20:39:14 EST


Denis Vlasenko [vda@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] wrote:
> On Friday 05 March 2004 20:13, Lawrence Walton wrote:
> > Hi all!
> >
> > I tried about four months ago to migrate a busy server to 2.6.0-test9,
> > and failed miserably. Lightly loaded, it worked well, but as the number
> > of users increased, the number of processes in uninterruptible sleep
> > increased to the hundreds and then the server fell on its face. I never
> > found out exactly why, or which processes were hanging; if I had to
> > guess, it would be openldap.
>
> Why do you guess? Determine what processes are stuck.
>
Because I did not expect it to happen, and when it did, I had lots of
users screaming at me to fix it now. The server had been up since the
night before. It was not until users started showing up in the morning
that the problem manifested itself.

The point is, I was hoping to get a list of things to try to capture in
case it happened again. Testing is all well and good, but getting
information from a production box can be valuable, as long as it's not
some odd corner case.

Capturing SysRq-T was on my list to do.
I'll investigate stack pointers and, if I can, post stack traces.
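
As a rough sketch of how to determine which processes are stuck the next
time it happens: the script below lists every process currently in
uninterruptible sleep (state D) by reading /proc/<pid>/stat. It is only
an illustration. It assumes a Linux /proc whose stat lines look like
"pid (comm) state ...", the names parse_stat and d_state_processes are
made up for this example, and Python is not something this thread
mentions. It shows no kernel stacks, so it complements a SysRq-T capture
rather than replacing it.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Run from cron or a loop while the load builds, this would at least record
which process names pile up in D state before the box becomes unusable.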

I was hoping to get pointers like the ones below before I tried it again.

<snip>
> Compile with stack pointers, capture SysRq-T, post stack traces
> of D processes to lkml.
> --
> vda
>

--
*--* Mail: lawrence@xxxxxxxx
*--* Voice: 425.739.4247
*--* Fax: 425.827.9577
*--* HTTP://the-penguin.otak.com/~lawrence
--------------------------------------
- - - - - - O t a k i n c . - - - - -

