Re: [PATCH v5 02/20] EDAC/synopsys: Fix generic device type detection procedure

From: Borislav Petkov
Date: Tue Jun 04 2024 - 14:38:54 EST


On Thu, Feb 22, 2024 at 09:12:47PM +0300, Serge Semin wrote:
> First of all the enum dev_type constants describe the memory DRAM chips
> used at the stick, not the entire DQ-bus width (see the enumeration kdoc

Which kdoc?

The kernel doc above enum dev_type in include/linux/edac.h?

In any case, you need to be precise pls.

> for details). So what is returned from the zynqmp_get_dtype() function and
> then specified to the dimm_info->dtype field is definitely incorrect.

Whoops, you lost me here. Why is it incorrect?

You want

"zynqmp_get_dtype - Return the controller memory width."

to return the memory width supported by the controller?

dimm->dtype = p_data->get_dtype(priv->baseaddr);

enum dev_type dtype; /* memory device type */

Yeah, no, that function returns the DIMM device type.

/me looks at the code.

Aha, so you mean the device width should be determined from that
DDRC_MSTR_CFG* thing.

> Secondly the DRAM chips type has nothing to do with the data bus width
> specified in the MSTR.data_bus_width CSR field. That CSR field just
> determines the part of the whole DQ-bus currently used to access the data
> from the all DRAM memory chips. So it doesn't indicate the individual
> chips type. Thirdly the DRAM chips type can be determined only in case of
> the DDR4 protocol by means of the MSTR.device_config field state (it is

Hold on, this driver runs on all kinds of hardware I presume. Are you
thinking about older ones which don't do DDR4?

Or does that thing do DDR4 only?

> supposed to be set by the system firmware). Finally the DW uMCTL2 DDRC ECC
> capability doesn't depend on the memory chips type. Moreover it doesn't
> depend on the utilized data bus width in runtime either. The IP-core
> reference manual says in [1,2] that the ECC support can't be enabled
> during the IP-core synthesizes for the DRAM data bus widths other than 16,

This sentence is missing something.

> 32 or 64. At the same time the bus width mode (MSTR.data_bus_width)
> doesn't change the ECC feature availability. Thus it was wrong to
> determine the ECC state with respect to the DQ-bus width mode.

You need to split your paragraphs with newlines to help readability.
Right now it is a blob of hard to parse text. For example, when you have
to write "Secondly, " that's your split right there. "Thirdly," is your
next newline. And so on.

> Fix all of the mistakes described above in the zynqmp_get_dtype() and
> zynqmp_get_ecc_state() methods: specify actual DRAM chips data width only
> for the DDR4 protocol and return that it's UNKNOWN in the rest of the
> cases;

What are the rest of the cases and why is it ok to return UNKNOWN all of
a sudden? IOW, how was the old code even tested?!

> determine ECC availability by the ECCCFG0.ecc_mode field state
> only (that field can't be modified anyway if the IP-core was synthesized
> with no ECC support).
>
> [1] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> Databook, Version 3.91a, October 2020, p. 421.
> [2] DesignWare® Cores Enhanced Universal DDR Memory Controller (uMCTL2)
> Databook, Version 3.91a, October 2020, p. 633.

Can those be freely accessed?

If not, you should say so.

> Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")

So this commit is in 4.20.

Does that mean that this fix needs to get backported to all stable
kernels?

Have you tested this on all hw this driver supports and made sure no
regressions are introduced?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette