Re: 2.6.8-rc1: Possible SCSI-related problem on dual Opteron w/ NUMA

From: R. J. Wysocki
Date: Sat Jul 17 2004 - 14:01:23 EST


On Saturday 17 of July 2004 20:12, Andi Kleen wrote:
> On Sat, Jul 17, 2004 at 06:26:03PM +0200, R. J. Wysocki wrote:
> > I got this on a dual Opteron system on 2.6.8-rc1 with the latest x86-64
> > patchset from Andi:
>
> Does it happen with x86_64-2.6.8-1 too ?

It did not happen when I was running that kernel, but I had only run it a
couple of times. It generally does not happen with this one either: it has
happened only once, and I reported it immediately. But ...

... I saw something very similar on 2.6.7-rc1 and I'm now looking at the log
(attached; it's partially corrupted because /var was on /dev/sdb5, which
failed too). The hardware configuration was similar to the current one,
AFAIR.

Well, it looks like the whole SCSI bus sometimes goes south for some reason
and it may very well be a hardware problem that manifests itself in such a
(strange?) way. Or not. Anyway, it certainly had not happened on this
hardware _before_ 2.6.7-rc1.

I have no idea what to do to make it happen again (suggestions welcome).

rjw

--
Rafael J. Wysocki
----------------------------
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
-- Richard P. Feynman

Attachment: 2.6.7-rc1.log.gz
Description: GNU Zip compressed data