100 ms boot time increase regression in acpi_init()/acpi_scan_bus()

From: Paul Menzel
Date: Mon Jan 10 2022 - 06:30:53 EST


#regzbot introduced: v5.13..v5.14-rc1
#regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215419


Dear Linux folks,


On the Intel T4500 laptop Acer TravelMate 5735Z with Debian sid/unstable, there is a 100 ms boot time regression that was introduced between Linux 5.10.46 and 5.13.9 and is still present in Linux 5.15.5.

[ 0.000000] microcode: microcode updated early to revision 0xa0b, date = 2010-09-28
[ 0.000000] Linux version 5.15.0-2-amd64 (debian-kernel@xxxxxxxxxxxxxxxx) (gcc-11 (Debian 11.2.0-13) 11.2.0, GNU ld (GNU Binutils for Debian) 2.37) #1 SMP Debian 5.15.5-2 (2021-12-18)
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-2-amd64 root=UUID=e17cec4f-d2b8-4cc3-bd39-39a10ed422f4 ro quiet noisapnp cryptomgr.notests random.trust_cpu=on initcall_debug log_buf_len=4M
[…]
[ 0.262243] calling acpi_init+0x0/0x487 @ 1
[…]
[ 0.281655] ACPI: Enabled 15 GPEs in block 00 to 3F
[ 0.394855] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[…]
[ 0.570908] initcall acpi_init+0x0/0x487 returned 0 after 300781 usecs
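
The delay sits between the `Enabled 15 GPEs` and `PCI Root Bridge [PCI0]` lines. A minimal sketch for measuring such a gap from a saved log, assuming the usual `[ seconds.micros] message` prefix; the helper names `timestamp_of` and `gap_ms` are made up for illustration:

```python
import re

# Matches a timestamped dmesg line such as "[ 0.281655] ACPI: Enabled ...".
# Elision markers like "[…]" do not match and are skipped.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def timestamp_of(lines, needle):
    """Return the timestamp in seconds of the first line containing needle."""
    for line in lines:
        m = TS_RE.match(line.strip())
        if m and needle in m.group(2):
            return float(m.group(1))
    raise ValueError(f"no timestamped line contains {needle!r}")


def gap_ms(lines, before, after):
    """Return the time in milliseconds between two log messages."""
    return (timestamp_of(lines, after) - timestamp_of(lines, before)) * 1000.0


# Lines copied from the excerpt above.
excerpt = [
    "[ 0.262243] calling acpi_init+0x0/0x487 @ 1",
    "[…]",
    "[ 0.281655] ACPI: Enabled 15 GPEs in block 00 to 3F",
    "[ 0.394855] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])",
    "[…]",
    "[ 0.570908] initcall acpi_init+0x0/0x487 returned 0 after 300781 usecs",
]

print(f"{gap_ms(excerpt, 'Enabled 15 GPEs', 'PCI Root Bridge'):.1f} ms")  # 113.2 ms
```

For the excerpt this gives roughly 113 ms, out of about 300 ms for the whole `acpi_init()` initcall.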

I attached all the log files to the Kernel.org Bugzilla bug report #215419 [1].

Unfortunately, I am unable to bisect the issue, as it’s not my machine, and I do not have a lot of access to it.

Using ftrace (unfortunately, I didn’t save all of the traces), I think the path is

acpi_init() → acpi_scan_init() → acpi_bus_scan(ACPI_ROOT_OBJECT)

But this path hasn’t changed as far as I can see. Anyway, from that path, somehow

acpi_bus_check_add_1() → acpi_bus_check_add() → … → acpi_bus_check_add() → acpi_add_single_object() → acpi_bus_get_status()

is called, and the `acpi_bus_get_status()` call takes 100 ms on the system – also the cause for bug #208705 [2] –, but that code path wasn’t taken before.

Do you know off the top of your head what changed? I am going to have brief access to the system only every two weeks or so, so debugging is unfortunately quite hard.

What is already on my to-do list:

1. Use dynamic debug for `drivers/acpi/scan.c`
2. Trace an older Linux kernel (5.10.46) to see the differences
3. Boot some GNU/Linux system to test 5.11 (Ubuntu 21.04) and 5.12
4. Unrelated to the regression, but trace `acpi_bus_get_status()` to understand the 100 ms delay to solve bug #208705 [2]
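
For comparing an older and a newer kernel (item 2), a rough sketch that extracts the `initcall_debug` durations from two saved `dmesg` logs and lists the initcalls that got slower. The function names and the 50 ms threshold are assumptions for illustration, and the regular expression assumes the `initcall NAME+0x…/0x… returned N after M usecs` format shown in the log above:

```python
import re

# Parses lines like:
#   initcall acpi_init+0x0/0x487 returned 0 after 300781 usecs
# The "+offset/size" part is dropped because it differs between builds.
# An optional " [module]" suffix after the symbol is tolerated.
INITCALL_RE = re.compile(
    r"initcall (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[[^\]]+\])?"
    r" returned -?\d+ after (\d+) usecs"
)


def initcall_durations(lines):
    """Map initcall name -> duration in microseconds."""
    durations = {}
    for line in lines:
        m = INITCALL_RE.search(line)
        if m:
            durations[m.group(1)] = int(m.group(2))
    return durations


def slower_initcalls(old_lines, new_lines, threshold_usecs=50_000):
    """List (name, extra_usecs) for initcalls slower in the new log,
    largest increase first. Only initcalls present in both logs count."""
    old = initcall_durations(old_lines)
    new = initcall_durations(new_lines)
    diffs = [
        (name, new[name] - old[name])
        for name in new
        if name in old and new[name] - old[name] >= threshold_usecs
    ]
    return sorted(diffs, key=lambda d: d[1], reverse=True)


# Line copied from the 5.15.5 log above.
line = "[ 0.570908] initcall acpi_init+0x0/0x487 returned 0 after 300781 usecs"
print(initcall_durations([line]))  # {'acpi_init': 300781}
```

Feeding it the 5.10.46 and 5.15.5 logs (both booted with `initcall_debug`) should show whether `acpi_init()` is the only initcall that grew.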


Kind regards,

Paul


PS: Do you know of GNU/Linux live systems that are available for all Linux kernel releases and have an initrd that just stores/uploads the output of `dmesg`?


[1]: https://bugzilla.kernel.org/show_bug.cgi?id=215419
"100 ms regression in boottime before `ACPI: PCI Root Bridge [PCI0]"
[2]: https://bugzilla.kernel.org/show_bug.cgi?id=208705
"boot performance: 100 ms delay in PCI initialization - Acer TravelMate 5735Z"