Re: [PATCH -next] crash: fix x86_32 memory reserve dead loop retry bug at "high"

From: Jinjie Ruan
Date: Wed Jul 17 2024 - 21:21:00 EST

On 2024/7/17 21:38, Baoquan He wrote:
> On 07/17/24 at 03:09pm, Jinjie Ruan wrote:
>> Similar to commit 8f9dade5906a ("crash: fix x86_32 memory reserve dead loop
>> retry bug"), in the symmetric case on an x86_32 QEMU machine with 1GB of
>> memory, the cmdline "crashkernel=512M" also causes a system stall, as
>> below:
>>
>> ACPI: Reserving FACP table memory at [mem 0x3ffe18b8-0x3ffe192b]
>> ACPI: Reserving DSDT table memory at [mem 0x3ffe0040-0x3ffe18b7]
>> ACPI: Reserving FACS table memory at [mem 0x3ffe0000-0x3ffe003f]
>> ACPI: Reserving APIC table memory at [mem 0x3ffe192c-0x3ffe19bb]
>> ACPI: Reserving HPET table memory at [mem 0x3ffe19bc-0x3ffe19f3]
>> ACPI: Reserving WAET table memory at [mem 0x3ffe19f4-0x3ffe1a1b]
>> 143MB HIGHMEM available.
>> 879MB LOWMEM available.
>> mapped low ram: 0 - 36ffe000
>> low ram: 0 - 36ffe000
>> (stall here)
>>
>> The reason is that CRASH_ADDR_LOW_MAX is equal to CRASH_ADDR_HIGH_MAX on
>> x86_32. The first "low" crash kernel memory reservation for 512M fails,
>> then the code enters the "retry" loop and never leaves it, as below
>> (consider CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX = 512M):
>>
>> -> reserve_crashkernel_generic() and high is false
>> -> alloc at [0, 0x20000000] fails
>> -> alloc at [0x20000000, 0x20000000] fails, and this repeats forever
>> (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
>>
>> Fix it by also changing the other exit check condition. The fixed-base
>> case has no problem, because it warns and bails out if the allocation fails.
>>
>> After this patch, it prints:
>> cannot allocate crashkernel (size:0x20000000)
>>
>> Fixes: 9c08a2a139fe ("x86: kdump: use generic interface to simplify crashkernel reservation code")
>> Signed-off-by: Jinjie Ruan <ruanjinjie@xxxxxxxxxx>
>> ---
>> kernel/crash_reserve.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
>> index 03e455738e75..36c13cf942f4 100644
>> --- a/kernel/crash_reserve.c
>> +++ b/kernel/crash_reserve.c
>> @@ -409,7 +409,7 @@ void __init reserve_crashkernel_generic(char *cmdline,
>> * low memory, fall back to high memory, the minimum required
>> * low memory will be reserved later.
>> */
>> - if (!high && search_end == CRASH_ADDR_LOW_MAX) {
>> + if (!high && !search_base) {
>
> Hmm, this may not be good. We can't guarantee that CRASH_ADDR_LOW_MAX is
> never 0. I still suggest you test the draft patch below to see if it works
> well. And we should revert the patch in Andrew's tree since it's not good.
> Posting a mess like this will confuse people and make backporting
> harder.

OK, let me get this straight and test your draft patches. If they work,
I'll send them soon.

>
> You haven't responded to my earlier request to test those two draft
> patches. Once you have tested the code below and it works, you can post
> it as a formal patch. So my suggestion for the whole work is:
>
> 1) revert commit 8f9dade5906a in Andrew's tree;
>
> 2) post the two patches I suggested to prevent crashkernel=,high on 32-bit
> systems and to fix the issue you found;
>
> 3) post patchset to make arm32 use generic crashkernel reservation.

Sorry, I didn't quite understand your real intentions before. Thanks for
the suggestion. I'll retest and submit the patch based on your suggestion.

>
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index 5b2722a93a48..ac087ba442cd 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -414,7 +414,8 @@ void __init reserve_crashkernel_generic(char *cmdline,
> search_end = CRASH_ADDR_HIGH_MAX;
> search_base = CRASH_ADDR_LOW_MAX;
> crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> - goto retry;
> + if (search_base != search_end)
> + goto retry;
> }
>
> /*
>
>
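To check my understanding of the draft, here is a small user-space model of
the retry path. It is only a sketch, not kernel code: sim_alloc(),
sim_reserve() and the SIM_* constants are hypothetical names, the allocator
is assumed to always fail as in the report, only the "low to high" fallback
branch is modelled, and an iteration cap stands in for the real stall.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the x86_32 limits; both are taken as 512M,
 * matching the example in the commit message. */
#define SIM_LOW_MAX   0x20000000ULL
#define SIM_HIGH_MAX  0x20000000ULL
#define SIM_MAX_TRIES 8	/* cap so the unguarded variant terminates here */

/* Stand-in for the real allocation: assumed to always fail. */
static bool sim_alloc(uint64_t base, uint64_t end)
{
	(void)base;
	(void)end;
	return false;
}

/*
 * Model of the "low" fallback branch only. Returns the number of
 * allocation attempts made. guard == false models the code without the
 * draft change; guard == true adds the search_base != search_end check.
 */
static int sim_reserve(bool guard)
{
	uint64_t search_base = 0, search_end = SIM_LOW_MAX;
	bool high = false;
	int tries = 0;

retry:
	if (tries >= SIM_MAX_TRIES)
		return tries;	/* would be an endless loop in the kernel */
	tries++;
	if (sim_alloc(search_base, search_end))
		return tries;

	if (!high && search_end == SIM_LOW_MAX) {
		search_end = SIM_HIGH_MAX;
		search_base = SIM_LOW_MAX;
		/* Retry only if the new window is not empty. */
		if (!guard || search_base != search_end)
			goto retry;
	}
	return tries;	/* falls through to the "cannot allocate" path */
}
```

In this model sim_reserve(false) keeps retrying until it hits the
SIM_MAX_TRIES cap, while sim_reserve(true) gives up after the first failed
attempt because the fallback window [0x20000000, 0x20000000] is empty.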