Re: [PATCH v2 4/5] kdump: wait for DMA to finish when using CMA
From: Baoquan He
Date: Sun Mar 02 2025 - 21:03:14 EST
On 02/20/25 at 05:55pm, Jiri Bohac wrote:
> When re-using the CMA area for kdump there is a risk of pending DMA into
> pinned user pages in the CMA area.
>
> Pages that are pinned long-term are migrated away from CMA, so these are not a
> concern. Pages pinned without FOLL_LONGTERM remain in the CMA and may possibly
> be the source or destination of a pending DMA transfer.
>
> Although there is no clear specification how long a page may be pinned without
> FOLL_LONGTERM, pinning without the flag shows an intent of the caller to
> only use the memory for short-lived DMA transfers, not a transfer initiated
> by a device asynchronously at a random time in the future.
>
> Add a delay of CMA_DMA_TIMEOUT_MSEC milliseconds before starting the kdump
> kernel, giving such short-lived DMA transfers time to finish before the CMA
> memory is re-used by the kdump kernel.
>
> Set CMA_DMA_TIMEOUT_MSEC to 1000 (one second) - chosen arbitrarily as both a
> huge margin for a DMA transfer, yet not increasing the kdump time
> significantly.
>
> Signed-off-by: Jiri Bohac <jbohac@xxxxxxx>
> ---
> include/linux/crash_core.h | 5 +++++
> kernel/crash_core.c | 10 ++++++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 44305336314e..543e4a71f13c 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -56,6 +56,11 @@ static inline unsigned int crash_get_elfcorehdr_size(void) { return 0; }
> /* Alignment required for elf header segment */
> #define ELF_CORE_HEADER_ALIGN 4096
>
> +/* Time to wait for possible DMA to finish before starting the kdump kernel
> + * when a CMA reservation is used
> + */
> +#define CMA_DMA_TIMEOUT_MSEC 1000
> +
> extern int crash_exclude_mem_range(struct crash_mem *mem,
> unsigned long long mstart,
> unsigned long long mend);
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 078fe5bc5a74..543e509b7926 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -21,6 +21,7 @@
> #include <linux/reboot.h>
> #include <linux/btf.h>
> #include <linux/objtool.h>
> +#include <linux/delay.h>
>
> #include <asm/page.h>
> #include <asm/sections.h>
> @@ -97,6 +98,14 @@ int kexec_crash_loaded(void)
> }
> EXPORT_SYMBOL_GPL(kexec_crash_loaded);
>
> +static void crash_cma_clear_pending_dma(void)
> +{
> +	if (!crashk_cma_cnt)
> +		return;
> +
> +	mdelay(CMA_DMA_TIMEOUT_MSEC);
> +}
> +
> /*
> * No panic_cpu check version of crash_kexec(). This function is called
> * only when panic_cpu holds the current CPU number; this is the only CPU
> @@ -116,6 +125,7 @@ void __noclone __crash_kexec(struct pt_regs *regs)
>  	if (kexec_crash_image) {
>  		struct pt_regs fixed_regs;
>
> +		crash_cma_clear_pending_dma();
This may be too idealistic; I am not sure it's a good approach. When a
crash is triggered, we need to do the urgent and necessary things as
quickly as possible, then shut down all CPUs to avoid further damage.
This one second of waiting could give the already misbehaving system too
much time. Just my personal opinion.
>  		crash_setup_regs(&fixed_regs, regs);
>  		crash_save_vmcoreinfo();
>  		machine_crash_shutdown(&fixed_regs);
>
> --
> Jiri Bohac <jbohac@xxxxxxx>
> SUSE Labs, Prague, Czechia
>