Re: [patch] SMP alternatives
From: thockin
Date: Thu Nov 24 2005 - 14:22:44 EST
On Thu, Nov 24, 2005 at 08:14:46PM +0100, Andi Kleen wrote:
> > I'm curious about that too. Even with k8 you can get down to a
> > chip-select, but that doesn't necessarily map to a DIMM in any useful way,
> > unless you have some mobo knowledge. Are we going to need a new BIOS
>
> Yeah that's my problem.
>
> It's not theoretical. We had cases where someone had to go
> through 10+ DIMMs on a big machine by trial and error to find
> out which one is wrong. Very bad situation.
I have the exact same problem right now. As part of our early bootup we run
a simplish memory test. Basically it's a "can the memory hold state"
test. If anything fails, we have to identify as exactly as possible WHICH
DIMM needs to be replaced, so the hardware ops people can do it at
assembly/test time.
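For illustration only, a "can the memory hold state" check could look
roughly like the sketch below. This is not the test described above; the
function name mem_hold_test and the write/read-back/invert scheme are
assumptions made for the example. It returns the index of the first word
that fails to hold its value, which in a real bootup test would be the
physical address that then has to be traced back to a node and chip-select.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: write a pattern to every word, read it back,
 * then repeat with the inverted pattern so that each bit is checked
 * holding both a 0 and a 1.
 *
 * Returns the index of the first failing word, or -1 if every word
 * held its value.
 */
static long mem_hold_test(volatile uint64_t *buf, size_t nwords,
                          uint64_t pattern)
{
    const uint64_t patterns[2] = { pattern, ~pattern };

    assert(buf != NULL);

    for (size_t p = 0; p < 2; p++) {
        /* Fill the whole region first, then verify it. */
        for (size_t i = 0; i < nwords; i++)
            buf[i] = patterns[p];
        for (size_t i = 0; i < nwords; i++)
            if (buf[i] != patterns[p])
                return (long)i;
    }
    return -1;
}
```

Filling the whole region before reading anything back (rather than
checking each word right after writing it) gives a weak cell some time to
lose its state before it is verified.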
We implemented AMD's reference algorithm, and made it work in the presence
of a hardware IO hole. It seems to work beautifully, but the last step is
turning a (node:chip-select) into a (node:dimm). Simple boards will use
simple mappings, but we can't know that without board-specific info.
Especially with quad-rank DIMMs. :)
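To make that last step concrete, here is a minimal sketch of what the
board-specific info might look like: a per-board table indexed by
(node, chip-select) that yields a DIMM slot. Everything in it is
hypothetical, including the board name, the slot numbering, the wiring
shown, and the assumption of 8 chip-selects per node. A real board may
wire chip-selects to slots in a different order, which is exactly why it
has to be a table.

```c
#include <assert.h>
#include <stddef.h>

#define CS_PER_NODE 8   /* assumed number of chip-selects per node */
#define MAX_NODES   2   /* size of this made-up example board */

struct board_dimm_map {
    const char *board;  /* hypothetical board name */
    int nodes;
    /* cs_to_dimm[node][cs] = DIMM slot on that node, -1 if unused */
    signed char cs_to_dimm[MAX_NODES][CS_PER_NODE];
};

/* Made-up wiring, for illustration only. */
static const struct board_dimm_map example_board = {
    .board = "example-2node",
    .nodes = 2,
    .cs_to_dimm = {
        /* node 0: four dual-rank slots, two chip-selects each */
        { 0, 0, 1, 1, 2, 2, 3, 3 },
        /* node 1: two quad-rank slots, four chip-selects each */
        { 0, 0, 0, 0, 1, 1, 1, 1 },
    },
};

/* Turn (node:chip-select) into a DIMM slot; -1 if out of range. */
static int lookup_dimm(const struct board_dimm_map *map, int node, int cs)
{
    assert(map != NULL);

    if (node < 0 || node >= map->nodes || cs < 0 || cs >= CS_PER_NODE)
        return -1;
    return map->cs_to_dimm[node][cs];
}
```

With this example table, chip-select 5 lands on slot 2 of node 0 but on
slot 1 of node 1, so the same chip-select number means a different
physical DIMM depending on how each node's slots are populated and wired.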
> > table to map chip-selects onto DIMMs? :)
>
> I proposed something like that - best with an ASCII string
> ("First DIMM on the top left corner") But getting such stuff into BIOS
> is difficult and long winded.
It would be easy enough to get into LinuxBIOS. :)
Seriously, this is work that is *long* overdue. I have been wanting to
look at this for over a year, but I have not had time.
Doing proper architecture- and chipset-specific ECC/error handling which
ties into a bigger abstracted error system is going to be really nice.