[RFC] cpufreq: Excessive CPUFreq driver loading
From: Meyer, Kyle
Date: Thu May 06 2021 - 10:25:51 EST
Hello,
acpi-cpufreq is mutually exclusive with intel_pstate; however, acpi-cpufreq is
loaded multiple times during startup while intel_pstate is enabled.
This issue was reported to the systemd maintainers and they indicated that it
should be fixed in the kernel: https://github.com/systemd/systemd/issues/19439
During startup, the kernel triggers one uevent for each device as a result of
systemd-udev-trigger.service executing "udevadm trigger --type=subsystems
--action=add" and "udevadm trigger --type=devices --action=add". The service
exists to retrigger all devices, because uevents sent by the kernel before
systemd-udevd was running would have been missed. When systemd-udevd receives a
uevent it matches its configured rules against the device. If a uevent's
ACTION=="add", systemd-udevd will run "kmod load $env{MODALIAS}" from
80-drivers.rules. udev's builtin kmod will then attempt to load modules
matching the device's MODALIAS.
When systemd-udevd receives an "add" uevent from
/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0007:XXX it runs "kmod load
acpi:ACPI0007:" as "acpi:ACPI0007:" is that device's MODALIAS.
When systemd-udevd receives an "add" uevent from /devices/system/cpu/cpuXXX it
runs "kmod load cpu:type:x86,...,00E8,..." as "cpu:type:x86,...,00E8,..." is
that device's MODALIAS.
acpi-cpufreq is loaded as it matches both devices' MODALIASes.
# modinfo acpi-cpufreq | grep alias
alias: acpi
alias: cpu:type:x86,ven*fam*mod*:feature:*00E8*
alias: cpu:type:x86,ven*fam*mod*:feature:*0016*
alias: acpi*:ACPI0007:*
alias: acpi*:LNXCPU:*
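As a rough illustration of why both uevents resolve to the same module, the
sketch below checks a MODALIAS string against the alias patterns listed above.
It assumes that POSIX fnmatch() wildcard matching is a reasonable stand-in for
kmod's alias lookup; the helper name modalias_matches_acpi_cpufreq() is
hypothetical and is not part of kmod or udev.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Alias patterns copied from the modinfo output above. */
static const char *const acpi_cpufreq_aliases[] = {
	"acpi",
	"cpu:type:x86,ven*fam*mod*:feature:*00E8*",
	"cpu:type:x86,ven*fam*mod*:feature:*0016*",
	"acpi*:ACPI0007:*",
	"acpi*:LNXCPU:*",
};

/*
 * Hypothetical helper: returns 1 if the MODALIAS matches any alias
 * pattern, 0 otherwise.
 * Assumption: plain fnmatch() wildcards stand in for kmod's alias lookup.
 */
static int modalias_matches_acpi_cpufreq(const char *modalias)
{
	size_t n = sizeof(acpi_cpufreq_aliases) / sizeof(acpi_cpufreq_aliases[0]);

	assert(modalias != NULL);
	for (size_t i = 0; i < n; i++) {
		if (fnmatch(acpi_cpufreq_aliases[i], modalias, 0) == 0)
			return 1;
	}
	return 0;
}
```

Under that assumption, "acpi:ACPI0007:" matches the "acpi*:ACPI0007:*" alias,
and a made-up CPU MODALIAS whose feature list contains 00E8 matches the first
"cpu:type:x86,..." alias, so either uevent ends in a load request for
acpi-cpufreq.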
On a system with 1536 logical CPUs, systemd-udevd attempts to load acpi-cpufreq
3072 times.
1536 * /devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0007:XXX
1536 * /devices/system/cpu/cpuXXX
The delay caused by systemd-udevd repeatedly attempting to load the driver has
a significant impact on the startup time. Because it postpones the loading of
other drivers, some devices are still unavailable when the root login prompt
is reached.
Each load attempt fails because acpi_cpufreq_init() returns -EEXIST:
static int __init acpi_cpufreq_init(void)
{
	int ret;
	if (acpi_disabled)
		return -ENODEV;
	/* don't keep reloading if cpufreq_driver exists */
	if (cpufreq_get_current_driver())
		return -EEXIST;
	...
Changing the return value from -EEXIST to 0 when another driver exists prevents
the driver from being loaded multiple times as kmod won't load a "live" module.
Alternatively, blacklisting the driver (or disabling intel_pstate) prevents the
issue as well. Below are the before and after startup times.
# systemd-analyze
Startup finished in 37.939s (kernel) + 10.909s (initrd) + 3min 55.004s (userspace) = 4min 43.852s
# systemd-analyze
Startup finished in 38.307s (kernel) + 10.205s (initrd) + 38.312s (userspace) = 1min 26.826s
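To make the effect of the return value concrete, here is a small userspace
model of the behaviour described above. It is not kernel or kmod code: the
struct and function names are hypothetical, and it assumes only what this
mail states, namely that a module whose init returns 0 stays "live" and that
kmod skips load requests for a live module.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model; none of these names are kernel or kmod APIs. */
struct module_model {
	bool live;        /* true once init has returned 0 */
	int init_calls;   /* how many times init actually ran */
	int busy_return;  /* what init returns when another driver is registered */
};

/* Models acpi_cpufreq_init() while another cpufreq driver is registered. */
static int model_init(struct module_model *m)
{
	m->init_calls++;
	return m->busy_return;
}

/* Models one "kmod load" request: skip a live module, otherwise run init. */
static void model_kmod_load(struct module_model *m)
{
	if (m->live)
		return;
	if (model_init(m) == 0)
		m->live = true;
}

/* Counts how often init runs for a given number of "add" uevents. */
static int count_init_calls(int busy_return, int uevents)
{
	struct module_model m = {
		.live = false,
		.init_calls = 0,
		.busy_return = busy_return,
	};

	assert(uevents >= 0);
	for (int i = 0; i < uevents; i++)
		model_kmod_load(&m);
	return m.init_calls;
}
```

In this model, count_init_calls(-EEXIST, 3072) runs init for every one of the
3072 uevents from the 1536-CPU system, while count_init_calls(0, 3072) runs it
once and skips the remaining requests.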
Thank you,
Kyle Meyer