Re: [PATCH] modules: Only return -EEXIST for modules that have finished loading

From: Jessica Yu
Date: Mon Apr 15 2019 - 09:20:50 EST


+++ Prarit Bhargava [15/04/19 08:04 -0400]:


On 4/15/19 7:23 AM, Jessica Yu wrote:
+++ Prarit Bhargava [02/04/19 09:39 -0400]:
Microsoft Hyper-V disables the X86_FEATURE_SMCA bit on AMD systems, and
Linux guests boot with repeated errors:

amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)

These errors occur because the module code erroneously returns -EEXIST
for modules that have failed to load and are in the process of being
removed from the module list.

Module amd64_edac_mod depends on module edac_mce_amd.  Using
modules.dep, systemd will load edac_mce_amd for every request of
amd64_edac_mod.  While the edac_mce_amd module is loading, it has
state MODULE_STATE_UNFORMED; once the module load fails, the state
becomes MODULE_STATE_GOING.  If another request for the edac_mce_amd
module executes at that point, add_unformed_module() will erroneously
return -EEXIST even though the previous instance of edac_mce_amd has
state MODULE_STATE_GOING.  Upon receiving -EEXIST, systemd attempts to
load amd64_edac_mod, which fails because of unknown symbols from
edac_mce_amd.

add_unformed_module() must wait, rather than return -EEXIST, whenever
the existing module is in any state other than MODULE_STATE_LIVE, to
prevent a race between multiple loads of dependent modules.
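
The decision described above can be modelled in user space roughly as
follows.  This is only a sketch, not the kernel code: the function names
check_existing_before()/check_existing_after() and the LOAD_WAIT
sentinel are hypothetical, the enum values are arbitrary, and the
"before" logic is inferred from the description (it is assumed to wait
only for COMING and UNFORMED modules).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* State names follow the description above; numeric values are arbitrary. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

/* Hypothetical sentinel: the caller should sleep, then retry the lookup. */
#define LOAD_WAIT 1

/*
 * Assumed behaviour before the fix: a module that is already on the list
 * but in MODULE_STATE_GOING is reported as -EEXIST, although it is about
 * to disappear.
 */
int check_existing_before(bool found, enum module_state state)
{
	if (!found)
		return 0;		/* no earlier instance, go ahead */
	assert((int)state >= 0 && state <= MODULE_STATE_UNFORMED);
	if (state == MODULE_STATE_COMING || state == MODULE_STATE_UNFORMED)
		return LOAD_WAIT;
	return -EEXIST;			/* LIVE and, wrongly, GOING */
}

/*
 * Behaviour described by the fix: only a fully loaded (LIVE) module
 * yields -EEXIST; every other state makes the caller wait and retry.
 */
int check_existing_after(bool found, enum module_state state)
{
	if (!found)
		return 0;
	assert((int)state >= 0 && state <= MODULE_STATE_UNFORMED);
	if (state != MODULE_STATE_LIVE)
		return LOAD_WAIT;
	return -EEXIST;
}
```

With an earlier instance in MODULE_STATE_GOING, check_existing_before()
returns -EEXIST while check_existing_after() returns LOAD_WAIT, which is
the difference that keeps the dependent load from starting too early.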

Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx>
Reported-by: Cathy Avery <cavery@xxxxxxxxxx>
Cc: Jessica Yu <jeyu@xxxxxxxxxx>

Applied to modules-next. Thanks Prarit!

Jessica, could I have the URL of the git tree?

Sure, you can find the modules-next branch at:

git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git

Thanks,

Jessica