Re: [PATCH v14 13/13] x86/mm: only invalidate final translations with INVLPGB
From: Dave Hansen
Date: Mon Mar 03 2025 - 17:40:56 EST
On 2/25/25 19:00, Rik van Riel wrote:
> static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid,
> 						  unsigned long addr,
> 						  u16 nr,
> -						  bool pmd_stride)
> +						  bool pmd_stride,
> +						  bool freed_tables)
> {
> -	__invlpgb(0, pcid, addr, nr, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
> +	u8 flags = INVLPGB_PCID | INVLPGB_VA;
> +
> +	if (!freed_tables)
> +		flags |= INVLPGB_FINAL_ONLY;
> +
> +	__invlpgb(0, pcid, addr, nr, pmd_stride, flags);
> }
I'm not sure this is OK.
Think of a hugetlbfs mapping with shared page tables. Say you had a
1GB-sized and 1GB-aligned mapping. Unmapping it might zap the one PUD
entry that it needs and set tlb->cleared_puds=1, but never set
->freed_tables because it didn't actually free the shared page table
page. The flush would then use INVLPGB_FINAL_ONLY even though an
upper-level entry changed.
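
To make the concern concrete, here is a minimal userspace sketch of the
flag selection in the hunk above. It is a model, not kernel code:
struct tlb_model is a hypothetical, trimmed-down stand-in for struct
mmu_gather, shared_pud_unmap_flags() is an illustrative helper, and the
numeric flag values are assumptions chosen only so the example compiles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; treat the numeric values as assumptions. */
#define INVLPGB_VA		(1u << 0)
#define INVLPGB_PCID		(1u << 1)
#define INVLPGB_FINAL_ONLY	(1u << 4)

/* Hypothetical, trimmed-down stand-in for struct mmu_gather. */
struct tlb_model {
	bool cleared_puds;	/* a PUD-level entry was zapped */
	bool freed_tables;	/* a page table page was actually freed */
};

/* Mirrors the flag selection in the quoted hunk. */
static uint8_t flags_from_patch(bool freed_tables)
{
	uint8_t flags = INVLPGB_PCID | INVLPGB_VA;

	if (!freed_tables)
		flags |= INVLPGB_FINAL_ONLY;

	return flags;
}

/*
 * The scenario from the text: the PUD entry is zapped, but the shared
 * page table page stays allocated, so freed_tables is never set.
 */
static uint8_t shared_pud_unmap_flags(void)
{
	struct tlb_model tlb = {
		.cleared_puds = true,
		.freed_tables = false,
	};

	return flags_from_patch(tlb.freed_tables);
}
```

In this model the shared-PUD unmap ends up with INVLPGB_FINAL_ONLY set
even though cleared_puds is true, which is the combination being
questioned.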
I'd honestly just throw this patch out of the series for now. All of the
other TLB invalidation that the kernel does implicitly tosses out the
mid-level paging structure caches.