Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC

From: Shenhar, Talel
Date: Thu Jun 06 2019 - 07:42:08 EST



Disagree. The various drivers don't depend on each other.
I think we should keep the drivers separated as they are distinct and independent IP blocks.
But they don't exist in isolation: they both depend on the integration choices and firmware
that make up your platform.

Other platforms may have exactly the same IP blocks, configured differently, or with
different features enabled in firmware. This means we can't just probe the driver based on
the presence of the IP block; we need to know that the integration choices and firmware
settings match what the driver requires.

(Case in point: A57 ECC support is optional, so another A57 may not have it.)

Descriptions of what firmware did don't really belong in the DT. It's not a hardware property.

This is why it's better to probe this stuff based on the machine-compatible/platform-name,
not the presence of the IP block in the DT.
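
To illustrate the point above, here is a minimal user-space sketch of matching on
the machine name rather than on an IP block. Everything in it is an assumption made
for illustration: the struct, the table, the "vendor,board-*" strings and
edac_match_machine() are hypothetical and do not come from the patch. An in-kernel
version would presumably test the root compatible with of_machine_is_compatible()
rather than strcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-platform integration facts; field names are illustrative. */
struct edac_platform {
	const char *machine_compatible;	/* root-node compatible string */
	int has_cache_ecc;		/* firmware enabled L1/L2 ECC reporting */
	int has_mc_ecc;			/* memory controller ECC is usable */
};

/* Hypothetical table: one entry per platform whose settings are known. */
static const struct edac_platform known_platforms[] = {
	{ "vendor,board-a", 1, 1 },
	{ "vendor,board-b", 0, 1 },	/* same IP blocks, cache ECC left off */
};

/*
 * Look the platform up by machine name. Returns NULL for an unknown
 * machine, in which case nothing should be registered even if the
 * IP block itself is present.
 */
static const struct edac_platform *edac_match_machine(const char *machine)
{
	size_t i;

	assert(machine != NULL);
	for (i = 0; i < sizeof(known_platforms) / sizeof(known_platforms[0]); i++) {
		if (strcmp(known_platforms[i].machine_compatible, machine) == 0)
			return &known_platforms[i];
	}
	return NULL;
}
```

The table is the one place where the integration choices live, so two boards with
identical IP blocks can still end up with different EDAC features registered.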


Will either of your separate drivers ever run alone? If they're probed from the same
machine-compatible this won't happen.


How does your memory controller report errors? Does it send back some data with an invalid
checksum, or a specific poison/invalid flag? Will the cache report this as a cache error
too? If it's an extra signal, does the cache know what it is?

All these are integration choices between the two IP blocks. Done as separate drivers, we
don't have anywhere to store that information. Even if you don't care about this, making
them separate drivers should only be done to make them usable on other platforms, where
these choices may have been different.

James,

Thanks for the prompt responses.

From our perspective, L1/L2 has nothing to do with the DDR memory controller.

It's true that they both use the same EDAC subsystem, but they use totally different APIs of it.

We also want separate controls for enabling/disabling L1/L2 EDAC vs. memory controller EDAC.

Even from a technical point of view, the L1/L2 UE collection method is totally different from collecting memory-controller UEs (CPU exception vs. actual interrupts).

So there is less reason to combine them than to give each one its own file, e.g. al_mc_edac, al_l1_l2_edac (I don't even see why Hanna combined L1 and L2...)

As we don't have any technical relation between the two, we would rather avoid this combination.

Also, let's assume we have different setups with different memory controllers: having a DT binding to control the difference is super easy and flexible.

Would having a dedicated folder for Amazon ease the move to separate files?

Thanks,

Talel.


Thanks,

James