Re: 2.2.10 oops (finally, something I can report!)

Justin Ossevoort (iq-0@internetionals.demon.nl)
Thu, 1 Jul 1999 17:55:58 +0200 (CEST)


Before someone else comes along saying that they had problems with a Cyrix
processor (I don't use one myself ;-) ):
wouldn't it be logical to look for the problem somewhere else instead of
blaming it on the CPU? Perhaps we should be blaming the kernel's support
for the CPU. Linus has already pointed in the direction of the "coma bug",
which might be a good clue. He thought there could be some problems in
that fix. So could some people with a Cyrix (or similar) CPU who sometimes
see Oopses just _try_ disabling the MTRR support and/or the "coma bug"
workaround?
I don't know where this oops occurs. Is it perhaps some signal register
of the CPU that indicated this fault (to which the kernel responded)? If
that is the case, the bug could simply lie in the processor's signal
registers. (I don't know the actual name of those things, I just know
that they exist in some form.)

On Thu, 1 Jul 1999, Steven N. Hirsch wrote:

> On Wed, 30 Jun 1999, Nate Eldredge wrote:
>
> > Linus wrote:
> > >
> > > The thing that does NOT make sense is the cause of the oops itself,
> > > though.
> > >
> > > The oops happens on
> > >
> > > c017b651 pushl %ebx
> > >
> > > and %esp = c3941e80.
> > >
> > > And quite frankly, there's not a way in h*ll that that instruction could
> > > raise the exception in question. But it does.
> > >
> > > I would _strongly_ suspect one of two things:
> > > - bad CPU.
> > > - bad cache or RAM timings.
> >
> > I had a Cyrix CPU some time back that had a *very* similar problem. I
> > believe it was running 2.0.36. Anyway, it worked absolutely fine, until
> > one day I built EGCS. This binary would, about 1/3 of the time, crash.
> > Poking around with a debugger showed that the instruction on which it
> > crashed was an access to a perfectly valid address (according to
> > /proc/xx/maps). Swapping in a different CPU (I think it was an Intel
> > Pentium) fixed it. ISTR it also could be fixed by turning off the L1
> > cache or something equally unacceptable performance-wise.
>
> I'll provide another data point on this issue. For about two years, my
> Cyrix P150+ box would crash 1 out of 3 times during kernel builds with
> spurious signal 11's. No rhyme or reason - the location was random and
> non-deterministic.
>
> Finally, after a suggestion from Alan Cox, I picked up a Pentium 166 and
> replaced the CPU. Haven't seen so much as a hiccup from the box since
> then (about 8 months now).
>
> This is almost certainly a hardware problem.
>
> Steve
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>

-- 
     -=( Justin Ossevoort )=-
  [iq-0@internetionals.demon.nl]
