Re: [PATCH] acpi: Fix hed module initialization order when it is built-in

From: Xiaofei Tan
Date: Sat Dec 28 2024 - 05:24:16 EST


Hi Jonathan,

On 2024/12/24 3:33, Jonathan Cameron wrote:
On Mon, 23 Dec 2024 17:31:08 +0800
Xiaofei Tan <tanxiaofei@xxxxxxxxxx> wrote:

Hi Rafael,

On 2024/12/11 1:59, Rafael J. Wysocki wrote:
On Fri, Nov 15, 2024 at 4:56 AM Xiaofei Tan <tanxiaofei@xxxxxxxxxx> wrote:
When the module hed is built-in, the init order is determined by
Makefile order.
Are you sure?
yes
We had a similar fix in CXL recently (which is why I suggested this approach
internally when tanxiaofei mentioned the problem).

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/cxl?id=6575b268157f37929948a8d1f3bafb3d7c055bc1

The related discussion for the CXL patch was the first time I'd come across this
solution to load order for built-in cases.

Yes :)

That order violates expectations, because hed is initialized
after evged. RAS records can't be handled in the window in which
evged has been initialized but hed has not.
If the number of such RAS records exceeds the number of APEI HEST
error sources, all HEST resources could be occupied, which could
then affect subsequent RAS error reporting.
Well, the problem is real, but does the change really prevent it from
happening or does it just increase the likelihood of success?
It is completely solved if the driver is built in. If HED is built as a
module, it is not solved.
Can we prevent that case with appropriate Kconfig?
It's annoying to restrict build options, but if that is needed to make it
work, it is better than not working!

Agreed, I will change ACPI_HED from tristate to bool if there are no other comments. Thanks.
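
A minimal sketch of what that Kconfig change might look like, for illustration only. The prompt string is assumed from the existing ACPI_HED entry in drivers/acpi/Kconfig and the help text is left out; neither is taken from this thread.

```
# drivers/acpi/Kconfig (sketch; prompt string assumed, help text omitted)
# The only intended change is the symbol type: tristate -> bool.
config ACPI_HED
	bool "Hardware Error Device"
```

With bool, ACPI_HED can only be y or n, so hed.o is always built in when it is enabled and the link order set in drivers/acpi/Makefile always applies.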


Jonathan


In the latter case, and generally speaking too, it would be better to
add explicit synchronization between evged and hed.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
Signed-off-by: Xiaofei Tan <tanxiaofei@xxxxxxxxxx>
---
drivers/acpi/Makefile | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 61ca4afe83dc..54f60b7922ad 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -15,6 +15,13 @@ endif

obj-$(CONFIG_ACPI) += tables.o

+#
+# hed.o must be linked before evged.o. Otherwise RAS errors cannot be
+# handled in the startup window in which evged has been initialized but
+# hed has not.
+#
+obj-$(CONFIG_ACPI_HED) += hed.o
+
#
# ACPI Core Subsystem (Interpreter)
#
@@ -95,7 +102,6 @@ obj-$(CONFIG_ACPI_HOTPLUG_IOAPIC) += ioapic.o
obj-$(CONFIG_ACPI_BATTERY) += battery.o
obj-$(CONFIG_ACPI_SBS) += sbshc.o
obj-$(CONFIG_ACPI_SBS) += sbs.o
-obj-$(CONFIG_ACPI_HED) += hed.o
obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o
obj-$(CONFIG_ACPI_BGRT) += bgrt.o
obj-$(CONFIG_ACPI_CPPC_LIB) += cppc_acpi.o
--
2.33.0