Re: The stability crisis

david parsons (o.r.c@p.e.l.l.p.o.r.t.l.a.n.d.o.r.u.s)
3 Jul 1999 16:53:44 -0700


In article <linux.kernel.d3lncygpwg.fsf@dxplus05.cern.ch>,
Jes Sorensen <Jes.Sorensen@cern.ch> wrote:
>>>>>> "Kim" == Kim Petersen <kim@vejlegruppen.dk> writes:
>
>Kim> Jes Sorensen <Jes.Sorensen@cern.ch> writes:
>>> One machine crashes, you have one serial cable, how much spaghetti
>>> is that?
>Kim> In a 19" rack there would still be spaghetti... but it really
>Kim> doesn't matter since what was asked for was a way to capture
>Kim> oopsen without resolving to serial-cables. I think the question
>Kim> was valid, and allthough we haven't had any oopses here, i don't
>Kim> wan't to even consider how much cleanup work, and recompiles
>Kim> etc. a reoccuring oops would make - and that even without setting
>Kim> up a serial-connection to capture an oops that we might not even
>Kim> be able to recreate, without having a possible buggy machine in a
>Kim> live network (doing normak tasks).
>
>Well I do not see spaghetti in our racks here, and we do have quite a
>few. Though we got most of the PCs in these special racks that utilize
>the space better when one sticks mini towers in them.

Not to belabor the obvious, but yours isn't the only equipment rack
in the world. In *my* experience, once I've put 3-4 machines into
a rack, plus the necessary networking plumbing to make them into a
backbone, it's close enough to spaghetti to confuse the most
discerning gourmand.

Your racks, which I'm sure each contain 30 or 40 machines, may be
an neat and tidy as pins. Celebrate. But it's foolish to assume
that everyone else is as fortunate as you.

>Kim> It would be really nice to have a way to capture oopses, that
>Kim> isn't depending on a serialcable. That way we could send the oops
>Kim> of to kernellist and immediatly get on to the work of recreating
>Kim> a stable machine.
>
>You still have to do your work manually and pipe it through ksymoops
>anyway -

So what? On the recieving end, the little UDP reader will
pre-parse the panic packet into a form that's readable by ksymoops,
so when the dead carcass is brought back to live you can dredge a
symbol table off it and build a good bug report. It would be a
lot easier than dragging in serial plumbing for every machine that
might possibly fail.

>it seems to me like some people expect all this to be done
>for them for free.

Heavens, the nerve of some people, asking on the kernel mailing
list if the infrastructure exists to get bug reports out of
failing machines so that bugs can be reported.

____
david parsons \bi/ A ksymoops'ed panic is a lot more useful than
\/ "the system mysteriously crashed"

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/