Re: [PATCH] modules: Only return -EEXIST for modules that have finished loading

From: Prarit Bhargava
Date: Mon Apr 15 2019 - 08:04:09 EST

On 4/15/19 7:23 AM, Jessica Yu wrote:
> +++ Prarit Bhargava [02/04/19 09:39 -0400]:
>> Microsoft Hyper-V disables the X86_FEATURE_SMCA bit on AMD systems, and
>> Linux guests boot with repeated errors:
>>
>> amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
>> amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
>> amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
>> amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
>> amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
>> amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
>>
>> The warnings occur because the module code erroneously returns -EEXIST
>> for modules that have failed to load and are in the process of being
>> removed from the module list.
>>
>> Module amd64_edac_mod has a dependency on module edac_mce_amd. Using
>> modules.dep, systemd will load edac_mce_amd for every request of
>> amd64_edac_mod. When the edac_mce_amd module loads, the module has
>> state MODULE_STATE_UNFORMED, and once the module load fails the state
>> becomes MODULE_STATE_GOING. Another request for the edac_mce_amd module
>> then executes, and add_unformed_module() erroneously returns -EEXIST
>> even though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
>> Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
>> fails because of unknown symbols from edac_mce_amd.
>>
>> add_unformed_module() must wait before returning whenever the existing
>> module is in any state other than MODULE_STATE_LIVE, to prevent a race
>> between multiple loads of dependent modules.
>>
>> Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx>
>> Reported-by: Cathy Avery <cavery@xxxxxxxxxx>
>> Cc: Jessica Yu <jeyu@xxxxxxxxxx>
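
For anyone reading along, the state check described above can be modeled
roughly in user space as follows. This is only a sketch, not the kernel
code: check_existing(), model_self_check(), MODEL_WAIT_AND_RETRY and the
"fixed" flag are hypothetical names made up for illustration, and the
pre-patch branch assumes the old check waited only for the UNFORMED and
COMING states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the module states named in the patch description. */
enum module_state {
	MODULE_STATE_LIVE,
	MODULE_STATE_COMING,
	MODULE_STATE_GOING,
	MODULE_STATE_UNFORMED,
};

/* Hypothetical code meaning "sleep, then look the module up again". */
#define MODEL_WAIT_AND_RETRY 1

/*
 * Model of the decision taken when a load request looks for an existing
 * instance with the same name. "old" is NULL when none is on the list.
 * "fixed" selects the proposed behavior (1) or the assumed old one (0).
 */
int check_existing(const enum module_state *old, int fixed)
{
	if (old == NULL)
		return 0;	/* nothing in the way: proceed with the load */

	if (fixed) {
		/* Proposed: only a fully loaded module yields -EEXIST. */
		if (*old != MODULE_STATE_LIVE)
			return MODEL_WAIT_AND_RETRY;
		return -EEXIST;
	}

	/* Assumed old check: a GOING module falls through to -EEXIST. */
	if (*old == MODULE_STATE_COMING || *old == MODULE_STATE_UNFORMED)
		return MODEL_WAIT_AND_RETRY;
	return -EEXIST;
}

/* Usage example: the failing instance of edac_mce_amd is GOING. */
void model_self_check(void)
{
	enum module_state going = MODULE_STATE_GOING;

	assert(check_existing(&going, 0) == -EEXIST);
	assert(check_existing(&going, 1) == MODEL_WAIT_AND_RETRY);
}
```

In the model, the second request for edac_mce_amd gets -EEXIST from the
old check while the first instance is still being torn down, which is
what lets the amd64_edac_mod load go ahead and hit the unknown symbols.
With the proposed check the second request waits and retries instead.
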
>
> Applied to modules-next. Thanks Prarit!

Jessica, could I have the URL of the git tree?

Thanks,

P.

>
> Jessica
>