Re: [PATCH v2] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.

From: Andrew Morton
Date: Wed Jul 10 2024 - 13:36:47 EST


On Wed, 10 Jul 2024 20:20:28 +0800 zhangchun <zhang.chuna@xxxxxxx> wrote:

> Use kmap_high and kmap_XXX or kumap_xxx among differt cores at the same
> time may cause deadlock. The issue is like this:


What is kmap_XXX?

> CPU 0: CPU 1:
> kmap_high(){ kmap_xxx() {
> ... irq_disable();
> spin_lock(&kmap_lock)
> ...
> map_new_virtual ...
> flush_all_zero_pkmaps
> flush_tlb_kernel_range /* CPU0 holds the kmap_lock */
> smp_call_function_many spin_lock(&kmap_lock)
> ... ....
> spin_unlock(&kmap_lock)
> ...
>
> CPU 0 holds the kmap_lock, waiting for CPU 1 respond to IPI. But CPU 1
> has disabled irqs, waiting for kmap_lock, cannot answer the IPI. Fix
> this by releasing kmap_lock before call flush_tlb_kernel_range,
> avoid kmap_lock deadlock.
>
> Fixes: 3297e760776a ("highmem: atomic highmem kmap page pinning")

Wow, that's 15 years old. Has the deadlock been observed?

> --- a/mm/highmem.c
> +++ b/mm/highmem.c
> @@ -220,8 +220,11 @@ static void flush_all_zero_pkmaps(void)
> set_page_address(page, NULL);
> need_flush = 1;
> }
> - if (need_flush)
> + if (need_flush) {
> + unlock_kmap();
> flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
> + lock_kmap();
> + }
> }

Why is dropping the lock like this safe? What data is it protecting
and why is it OK to leave that data unprotected here?