Re: [PATCH 01/12] mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s

From: Jann Horn
Date: Wed May 31 2023 - 13:07:37 EST


On Mon, May 29, 2023 at 8:15 AM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> Before putting them to use (several commits later), add rcu_read_lock()
> to pte_offset_map(), and rcu_read_unlock() to pte_unmap(). Make this a
> separate commit, since it risks exposing imbalances: prior commits have
> fixed all the known imbalances, but we may find some have been missed.
[...]
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index c7ab18a5fb77..674671835631 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -236,7 +236,7 @@ pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
>  {
>         pmd_t pmdval;
>
> -       /* rcu_read_lock() to be added later */
> +       rcu_read_lock();
>         pmdval = pmdp_get_lockless(pmd);
>         if (pmdvalp)
>                 *pmdvalp = pmdval;

It might be a good idea to document that this series assumes that the
first argument to __pte_offset_map() is a pointer into a second-level
page table (and not a pointer to a local copy of the entry), unless the
containing VMA is known not to be THP-eligible, the page table is
detached from the page table hierarchy, or something similar holds.
Currently a number of places pass pointers to local copies of the
entry, and while I think all of these are fine, it would probably be
good to at least document why these are allowed to do so while other
places aren't.

$ vgrep 'pte_offset_map(&'
Index File                    Line Content
    0 arch/sparc/mm/tlb.c      151 pte = pte_offset_map(&pmd, vaddr);
    1 kernel/events/core.c    7501 ptep = pte_offset_map(&pmd, addr);
    2 mm/gup.c                2460 ptem = ptep = pte_offset_map(&pmd, addr);
    3 mm/huge_memory.c        2057 pte = pte_offset_map(&_pmd, haddr);
    4 mm/huge_memory.c        2214 pte = pte_offset_map(&_pmd, haddr);
    5 mm/page_table_check.c    240 pte_t *ptep = pte_offset_map(&pmd, addr);
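
To illustrate the distinction I mean, here is a toy userspace sketch, not
kernel code. Every toy_* name is made up for this mail, the "huge" bit
encoding is an assumption, and the concurrent update is simulated
sequentially. It only shows the general hazard: a lookup through a local
copy cannot observe a later change to the real entry, while a lookup
through a pointer to the real entry can.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, NOT the kernel's pmd_t/pte_t. An entry is either 0
 * ("none"), has the low bit set ("huge"), or holds the address of a
 * toy PTE table. */
typedef struct { uintptr_t val; } toy_pmd_t;
typedef struct { unsigned long val; } toy_pte_t;

#define TOY_PMD_HUGE ((uintptr_t)1)

static toy_pte_t toy_pte_table[4];

/* Hypothetical analogue of __pte_offset_map(): sample the entry through
 * the pointer, optionally report the sampled value, and return the PTE
 * table it refers to, or NULL if the entry is none or huge. */
static toy_pte_t *toy_pte_offset_map(const toy_pmd_t *pmd, toy_pmd_t *pmdvalp)
{
	toy_pmd_t pmdval = *pmd;

	if (pmdvalp)
		*pmdvalp = pmdval;
	if (pmdval.val == 0 || (pmdval.val & TOY_PMD_HUGE))
		return NULL;
	return (toy_pte_t *)pmdval.val;
}

/* Returns true if a lookup through a stale local copy still yields the
 * old table while a lookup through the live entry sees the change. */
static bool stale_copy_misses_update(void)
{
	toy_pmd_t live = { (uintptr_t)toy_pte_table };
	toy_pmd_t copy = live;		/* caller's local snapshot */

	live.val = TOY_PMD_HUGE;	/* simulated concurrent change */

	return toy_pte_offset_map(&copy, NULL) == toy_pte_table &&
	       toy_pte_offset_map(&live, NULL) == NULL;
}
```

In this model the caller holding the local copy keeps getting the old
table back. In the real code that is only acceptable if something else
guarantees the entry cannot change underneath the caller, and that
guarantee is what I think should be documented for the callers listed
above.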