On Tue, Aug 08, 2023 at 11:18:07AM -0400, Yazen Ghannam wrote:
I think it would be better to avoid dependencies between independent things.
If they really are independent then I guess. Not that it all ends up in
a twisty dependency where you wish you should've merged the two
together. So think about all deps before you design this - it needs to
handle all cases without hackery.
For example, amd_smn_read() is mostly used in amd64_edac. EDAC was the
original user of SMN accesses, and all the SMN stuff could have been
included in EDAC. However, SMN is not specifically for EDAC, so it was added
to amd_nb.c to be commonly available. Currently, SMN accesses are done in
other modules. I don't think it would have been a good idea to force other
modules or subsystems to require EDAC to be used.
What does that have to do with this? SMN access is generic and should be
in amd_nb.c as it is needed by other stuff. EDAC, RAS, whatever are all
users of that thing.
This is my reasoning for a separate, independent module for the translation.
EDAC is the first user of this. But there will be future code that can
leverage this, like CXL, and even the MCE subsystem. And, yes, mce_amd may
be already loaded, but this isn't a given. A person may want MCE and CXL
support without wanting to use EDAC.
Is that a real use case or just a hypothetical thing?
Furthermore, some things using the translation will be built-in, so the
translation module will need to be built-in.
This sounds weird.
I agree. And I don't think much of the existing things in EDAC should be
moved out. But this is new code, so there's an opportunity to have it in a
more appropriate place.
And, thinking on it more, this could be another example for future "common
RAS" functionality. Isn't that why the CEC is in drivers/ras?
It is there because it doesn't need EDAC at all. If your translation
doesn't need EDAC and EDAC is going to be only a user of it, then good.
But if you're going to have to need the MCA error decoded by EDAC and
then the error translation done by this thing, then you'd need to
synchronize between the two. I'm not saying it is impossible - it should
be well thought out first though before you go coding.
It seems like things go into EDAC because it's thought of as the de
facto RAS location. But why have something in EDAC if it doesn't
provide EDAC functionality? Other RAS things, like AER, APEI, etc.,
don't live in EDAC.
AER is part of PCI so we haven't considereed tying it into EDAC. And
there wasn't any desire to do so.
As to APEI, there's ghes_edac...