On 09/05/24 at 02:07pm, Sourabh Jain wrote:
Hello Baoquan,How about this?
On 05/09/24 08:53, Baoquan He wrote:
On 09/04/24 at 02:55pm, Sourabh Jain wrote:Yes, but it is very rare to encounter.
Hello Baoquan,......snip...
On 30/08/24 16:47, Baoquan He wrote:
On 08/20/24 at 12:10pm, Sourabh Jain wrote:
Hello Baoquan,
But you wanted to avoid the erroring out if it's being inThe window where kernel is holding kexec_lock to do kexec boot2. A patch to return early from the `crash_handle_hotplug_event()` functionThere's a race gap between the kexec_in_progress checking and the
if `kexec_in_progress` is
set to True. This is essentially my original patch.
setting it to true which Michael has mentioned.
but kexec_in_progress is yet not set to True.
If kernel needs to handle crash hotplug event, the function
crash_handle_hotplug_event() will not get the kexec_lock and
error out by printing error message about not able to update
kdump image.
kernel_kexec(). Now you are seeing at least one the noising
message, aren't you?
My comments on your updated code are inline below.
The above code will print an error message during the race gap. Here's why:I think it should be fine. Given that lock is already taken forAh, I meant as below, but wrote it mistakenly.
kexec kernel boot.
Am I missing something major?
That's why I thinkTry for kexec lock before kexec_in_progress check will not solve
maybe checking kexec_in_progress after failing to retriving
__kexec_lock is a little better, not very sure.
the original problem this patch trying to solve.
You proposed the below changes earlier:
- if (!kexec_trylock()) {
+ if (!kexec_trylock() && kexec_in_progress) {
pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
crash_hotplug_unlock();
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 63cf89393c6e..e7c7aa761f46 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -504,7 +504,7 @@ int crash_check_hotplug_support(void)
crash_hotplug_lock();
/* Obtain lock while reading crash information */
- if (!kexec_trylock()) {
+ if (!kexec_trylock() && !kexec_in_progress) {
pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
crash_hotplug_unlock();
return 0;
Once the kexec_in_progress is set to True there is no way one can getWith your patch, you could still get the error message if the race gap
kexec_lock. So kexec_trylock() before kexec_in_progress is not helpful
for the problem I am trying to solve.
exist. With above change, you won't get it. Please correct me if I am
wrong.
Let’s say the kexec lock is acquired in the kernel_kexec() function,
but kexec_in_progress is not yet set to True. In this scenario, the code
will print
an error message.
There is another issue I see with the above code:
Consider that the system is on the kexec kernel boot path, and
kexec_in_progress
is set to True. If crash_hotplug_unlock() is called, the kernel will not
only update
the kdump image without acquiring the kexec lock, but it will also release
the
kexec lock in the out label. I believe this is incorrect.
Please share your thoughts.
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 63cf89393c6e..8ba7b1da0ded 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -505,7 +505,8 @@ int crash_check_hotplug_support(void)
crash_hotplug_lock();
/* Obtain lock while reading crash information */
if (!kexec_trylock()) {
- pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
+ if (!kexec_in_progress)
+ pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
crash_hotplug_unlock();
return 0;
}
@@ -540,7 +541,8 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
crash_hotplug_lock();
/* Obtain lock while changing crash information */
if (!kexec_trylock()) {
- pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
+ if (!kexec_in_progress)
+ pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
crash_hotplug_unlock();
return;
}