[PATCH v2] mm: Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.

From: zhangchun
Date: Thu Jul 18 2024 - 12:17:56 EST


Sorry to bother you. This is a friendly ping to check on the status of the
patch "Give kmap_lock before call flush_tlb_kernel_rang,avoid kmap_high deadlock.".
Please let me know if any additional information is needed from my side.

I sincerely look forward to your suggestions and guidance!

>>> >> --- a/mm/highmem.c
>>> >> +++ b/mm/highmem.c
>>> >> @@ -220,8 +220,11 @@ static void flush_all_zero_pkmaps(void)
>>> >>  		set_page_address(page, NULL);
>>> >>  		need_flush = 1;
>>> >>  	}
>>> >> -	if (need_flush)
>>> >> +	if (need_flush) {
>>> >> +		unlock_kmap();
>>> >>  		flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP));
>>> >> +		lock_kmap();
>>> >> +	}
>>> >>  }
>>>
>>> >Why is dropping the lock like this safe? What data is it protecting
>>> >and why is it OK to leave that data unprotected here?
>>>
>>> kmap_lock protects pkmap_count, pkmap_page_table and last_pkmap_nr (a static variable).
>>> flush_tlb_kernel_range(PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)) neither reads nor
>>> modifies these variables, so leaving that data unprotected during the call is safe.

>>No, the risk here is that when the lock is dropped, other threads will modify the data. And when this thread (the one running
>>flush_all_zero_pkmaps()) retakes the lock, that data may now be unexpectedly altered.

>map_new_virtual() tries to find a usable entry pkmap_count[last_pkmap_nr]. kmap_lock is held whenever
>pkmap_count[last_pkmap_nr] is read or modified.
>The check "if (!pkmap_count[last_pkmap_nr])" determines whether pkmap_count[last_pkmap_nr] is usable. If it is not, the loop tries again.

>Furthermore, the value of the static variable last_pkmap_nr is copied into a local variable last_pkmap_nr while kmap_lock is held,
>so this is thread-safe.

>In an extreme case, Thread A and Thread B access the same last_pkmap_nr. Thread A releases kmap_lock and calls
>flush_tlb_kernel_range(); Thread B then acquires kmap_lock and modifies pkmap_count[last_pkmap_nr]. After Thread A finishes
>flush_tlb_kernel_range() and retakes kmap_lock, it checks pkmap_count[last_pkmap_nr].
>If pkmap_count[last_pkmap_nr] != 0, Thread A goes on to call get_next_pkmap_nr() and gets the next last_pkmap_nr.

>static inline unsigned long map_new_virtual(struct page *page)
>{
>	unsigned long vaddr;
>	int count;
>	unsigned int last_pkmap_nr; /* local copy of the static last_pkmap_nr */
>	unsigned int color = get_pkmap_color(page);
>
>start:
>	...
>			flush_all_zero_pkmaps(); /* releases kmap_lock, then reacquires it */
>			count = get_pkmap_entries_count(color);
>		}
>	...
>		if (!pkmap_count[last_pkmap_nr]) /* is this entry free? */
>			break;	/* Found a usable entry */
>		if (--count)
>			continue;
>
>	...
>	vaddr = PKMAP_ADDR(last_pkmap_nr);
>	set_pte_at(&init_mm, vaddr,
>		   &(pkmap_page_table[last_pkmap_nr]), mk_pte(page, kmap_prot));
>
>	pkmap_count[last_pkmap_nr] = 1;
>	...
>	return vaddr;
>}
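
To make the Thread A / Thread B case above concrete, here is a minimal
user-space sketch of it. It is only a model, not kernel code: a plain flag
stands in for kmap_lock, slot_count[] stands in for pkmap_count[], and the
names NR_SLOTS, race_hook, thread_b_claims_slot0() and demo_race() are
hypothetical helpers invented for this illustration. It covers only the one
interleaving described above and does not say anything about other callers
of flush_all_zero_pkmaps().

```c
#include <assert.h>

#define NR_SLOTS 4                 /* hypothetical stand-in for LAST_PKMAP */

static int lock_held;              /* models kmap_lock as a simple flag */
static int slot_count[NR_SLOTS];   /* models pkmap_count[]; 0 means free */
static unsigned int next_slot;     /* models the static last_pkmap_nr */
static void (*race_hook)(void);    /* runs while the lock is dropped */

static void model_lock(void)   { assert(!lock_held); lock_held = 1; }
static void model_unlock(void) { assert(lock_held);  lock_held = 0; }

/* Models the patched flush: entered and left with the lock held. */
static void flush_with_lock_dropped(void)
{
	model_unlock();
	if (race_hook)
		race_hook();   /* another thread runs in the unlocked window */
	model_lock();
}

/* Models "Thread B": takes the lock and claims slot 0. */
static void thread_b_claims_slot0(void)
{
	model_lock();
	slot_count[0] = 1;
	model_unlock();
}

/* Models the search loop of map_new_virtual(); returns a slot or -1. */
static int map_slot(void)
{
	int count = NR_SLOTS;
	int found = -1;

	model_lock();
	while (count--) {
		unsigned int nr = next_slot;   /* local copy of the static index */

		next_slot = (next_slot + 1) % NR_SLOTS;
		if (nr == 0)                   /* wrapped around: flush first */
			flush_with_lock_dropped();
		if (!slot_count[nr]) {         /* checked after the lock is retaken */
			slot_count[nr] = 1;
			found = (int)nr;
			break;
		}
	}
	model_unlock();
	return found;
}

/* Thread A maps a page while Thread B races in during the flush. */
int demo_race(void)
{
	int i;

	for (i = 0; i < NR_SLOTS; i++)
		slot_count[i] = 0;
	lock_held = 0;
	next_slot = 0;
	race_hook = thread_b_claims_slot0;
	return map_slot();
}
```

In this model, demo_race() returns 1: Thread A finds that slot 0 was claimed
by Thread B during the unlocked window, skips it, and takes slot 1 instead.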

--
1.8.3.1