Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers

From: Mauro Carvalho Chehab
Date: Tue Apr 24 2012 - 09:12:09 EST


On 24-04-2012 09:55, Borislav Petkov wrote:
> On Tue, Apr 24, 2012 at 08:46:53AM -0300, Mauro Carvalho Chehab wrote:
>> On 24-04-2012 07:40, Borislav Petkov wrote:
>>> On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
>>>>>> +};
>>>>>> +
>>>>>> +/**
>>>>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>>>>> + * @type: layer type
>>>>>> + * @size: maximum size of the layer
>>>>>> + * @is_csrow: this layer is part of the "csrow" when the old API
>>>>>> + *            compatibility mode is enabled; otherwise, it is
>>>>>> + *            a channel
>>>>>> + */
>>>>>> +struct edac_mc_layer {
>>>>>> + enum edac_mc_layer_type type;
>>>>>> + unsigned size;
>>>>>> + bool is_csrow;
>>>>>> +};
>>>>>
>>>>> Huh, why do you need is_csrow? Can't do
>>>>>
>>>>> type = EDAC_MC_LAYER_CHIP_SELECT;
>>>>>
>>>>> ?
>>>>
>>>> No, that's different. For a csrow-based memory controller, is_csrow is equal to
>>>> type == EDAC_MC_LAYER_CHIP_SELECT; but, for the other memory controllers, it
>>>> is used to mark which layers will be used for the "fake csrow" exported by the
>>>> EDAC core through the legacy API.
>>>
>>> I don't understand this, do you mean: "this will be used to mark which
>>> layer will be used to fake a csrow"...?
>>
>> I've already explained this dozens of times: on x86, except for amd64_edac and
>> the drivers for legacy hardware (7+ years old), the information filled in struct
>> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
>>
>> There are no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
>> Intel memory controllers, it is possible to fill different channels with memories of
>> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
>> with an Intel W3505 CPU:
>>
>> $ ./edac-ctl --layout
>>        +-----------------------------------+
>>        |                mc0                |
>>        | channel0  | channel1  | channel2  |
>> -------+-----------------------------------+
>> slot2: |    0 MB   |    0 MB   |    0 MB   |
>> slot1: | 1024 MB   |    0 MB   |    0 MB   |
>> slot0: | 1024 MB   | 1024 MB   | 1024 MB   |
>> -------+-----------------------------------+
>>
>> Those are the logs that dump the Memory Controller registers:
>>
>> [ 115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>> [ 115.818950] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [ 115.818955] EDAC DEBUG: get_dimm_config: dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [ 115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>> [ 115.818985] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [ 115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
>> [ 115.819016] EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>
>> The Nehalem memory controllers allow up to 3 DIMMs per channel and have 3 channels (so,
>> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
>> so it isn't possible to have all channels and DIMMs filled on them.
>>
>> On this motherboard, DIMM1 to DIMM3 are mapped to the first dimm# at channels 0 to 2, and
>> DIMM4 goes to the second dimm# at channel 0.
>>
>> See? On slot 1, only channel 0 is filled.
>
> Ok, wait a second, wait a second.
>
> It's good that you brought up an example, that will probably help
> clarify things better.
>
> So, how many physical DIMMs are we talking about in the example above? 4, and
> all of them single-ranked? They must be, because it says "rank: 1" above.
>
> How would the table look if you had dual-ranked or quad-ranked DIMMs on
> the motherboard?

It won't change. The only changes will be in the debug logs. It would print
something like:

EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 4 ranks, UDIMMs
EDAC DEBUG: get_dimm_config: dimm 0 1024 Mb offset: 0, bank: 8, rank: 2, row: 0x4000, col: 0x400
EDAC DEBUG: get_dimm_config: dimm 1 1024 Mb offset: 4, bank: 8, rank: 2, row: 0x4000, col: 0x400

> I understand channel{0,1,2}, so what is "slot" now? Is that the physical
> DIMM slot on the motherboard?

physical slots:
DIMM1 - at MCU channel 0, dimm slot#0
DIMM2 - at MCU channel 1, dimm slot#0
DIMM3 - at MCU channel 2, dimm slot#0
DIMM4 - at MCU channel 0, dimm slot#1

This motherboard has only 4 slots.
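
To make that mapping concrete, here is a minimal sketch in C (the struct
and array names are made up for illustration; this is not i7core_edac
code):

/* How the motherboard silkscreen labels map to the (channel, slot)
 * coordinates reported by the memory controller registers. */
struct dimm_coord {
	const char *label;	/* silkscreen label on the board */
	unsigned channel;	/* MCU channel (0-2) */
	unsigned slot;		/* dimm slot# inside the channel (0-2) */
};

static const struct dimm_coord z400_map[] = {
	{ "DIMM1", 0, 0 },
	{ "DIMM2", 1, 0 },
	{ "DIMM3", 2, 0 },
	{ "DIMM4", 0, 1 },
};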

The i7core_edac driver is not able to discover how many physical DIMM slots
there are on the motherboard.

> If so, why are there 9 slots (3x3) when you say that most motherboards
> support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
> view from the memory controller or what you physically have on the
> motherboard?

slot{0,1,2} and channel{0,1,2} are the addresses given by the memory controller.
Not all motherboards have 9 physical DIMM slots, though; only high-end
motherboards provide 9 slots per MCU.

We have one Nehalem motherboard with 18 DIMM slots and 2 CPUs. On that
machine, it is possible to populate the maximum number of DIMMs the MCUs
support.

>
>> Even if this memory controller were rank-based[1], the channel
>> information couldn't be mapped using the legacy EDAC API, as, on the old
>> API, all channels need to be filled with memories of the same size.
>> So, this driver uses both the slot layer and the channel layer as the
>> fake csrow.
>
> So what is the slot layer? Is it something you've come up with, or is it
> a real DIMM slot on the motherboard?

It is the slot# inside each channel.
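
To illustrate, here is a minimal sketch of how a driver for this kind of
controller could describe its hierarchy with the proposed API (the
EDAC_MC_LAYER_CHANNEL and EDAC_MC_LAYER_SLOT values are assumed from the
rest of the patchset; the sizes are the Nehalem maximums discussed above):

/* Sketch only: describe the MC as a channel layer plus a slot layer.
 * Both are flagged is_csrow, so the EDAC core folds them together into
 * the fake csrows exposed through the legacy API: 3 channels x 3 slots
 * -> 9 virtual csrows, each with a single legacy channel. */
static struct edac_mc_layer layers[] = {
	{
		.type     = EDAC_MC_LAYER_CHANNEL,
		.size     = 3,		/* channels per MCU */
		.is_csrow = true,	/* folded into the fake csrow */
	},
	{
		.type     = EDAC_MC_LAYER_SLOT,
		.size     = 3,		/* dimm slots per channel */
		.is_csrow = true,	/* folded into the fake csrow */
	},
};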

>> [1] As you can see from the logs and from the source code, the MC
>> registers aren't per rank, they are per DIMM. The number of ranks
>> is just one attribute of the register that describes a DIMM. The
>> MCA error registers, however, don't map the rank when reporting an
>> error, nor are the error counters per rank. So, while it is possible
>> to enumerate information per rank, the error detection is always per
>> DIMM.
>
> Ok.
>
> [..]
>
