Re: [PATCH 1/2] platform/x86/amd: Introduce AMD Address Translation Library
From: Yazen Ghannam
Date: Tue Aug 08 2023 - 12:07:56 EST
On 8/8/2023 10:37 AM, Borislav Petkov wrote:
> On Tue, Aug 08, 2023 at 10:28:51AM -0400, Yazen Ghannam wrote:
>> Because this isn't intended to be only for MCA errors. The translation code
>> is related to the AMD Data Fabric. And it'll be a common back-end for memory
>> errors coming from MCA and CXL.
>
> But EDAC is not only about memory errors. Why not extend this into
> something which does other RAS functionality instead of doing a second
> one which is more or less related?
>
> mce_amd is already loaded on the system, why add a second module if it
> can be part of the first one just the same?
I think it would be better to avoid dependencies between independent things.
For example, amd_smn_read() is mostly used in amd64_edac. EDAC was the
original user of SMN accesses, and all the SMN stuff could have been
included in EDAC. However, SMN is not specifically for EDAC, so it was
added to amd_nb.c to be commonly available. Currently, SMN accesses are
done in other modules. I don't think it would have been a good idea to
force other modules or subsystems to require EDAC to be used.
This is my reasoning for a separate, independent module for the
translation. EDAC is the first user of this. But there will be future
code that can leverage this, like CXL, and even the MCE subsystem. And,
yes, mce_amd may be already loaded, but this isn't a given. A person may
want MCE and CXL support without wanting to use EDAC.
Furthermore, some things using the translation will be built-in, so the
translation module will need to be built-in. And it seems unnecessary to
require all of mce_amd to be built-in just for the translation part.
> Strictly speaking, this all should've been drivers/ras/ from the very
> beginning and all EDAC should move there but that's going to be madness
> to do now.
I agree. And I don't think much of the existing things in EDAC should be
moved out. But this is new code, so there's an opportunity to have it in
a more appropriate place.
And, thinking on it more, this could be another example for future
"common RAS" functionality. Isn't that why the CEC is in drivers/ras? It
seems like things go into EDAC because it's thought of as the de facto
RAS location. But why have something in EDAC if it doesn't provide EDAC
functionality? Other RAS things, like AER, APEI, etc., don't live in EDAC.
Thanks,
Yazen