Re: [PATCH 3/8] x86/mm/pat: Restore large pages after fragmentation

From: Kirill A. Shutemov
Date: Fri Jan 10 2025 - 05:37:30 EST


On Fri, Dec 27, 2024 at 09:28:20AM +0200, Mike Rapoport wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> Changing page attributes may lead to fragmentation of the direct
> mapping over time and, as a result, to performance degradation.
>
> With the current code it's a one-way road: the kernel tries to avoid
> splitting large pages, but it doesn't restore them even if the page
> attributes become compatible again.
>
> Any change to the mapping may potentially allow a large page to be
> restored.
>
> Hook into the cpa_flush() path to check whether there are any pages to
> be recovered in the PUD_SIZE range around the pages we've just touched.
>
> CPUs don't like[1] to have TLB entries of different sizes for the same
> memory, but it looks like it's okay as long as these entries have
> matching attributes[2]. Therefore it's critical to flush the TLB before
> any following changes to the mapping.
>
> Note that we already allow for multiple TLB entries of different sizes
> for the same memory now in split_large_page() path. It's not a new
> situation.
>
> set_memory_4k() provides a way to use 4k pages on purpose. The kernel
> must not remap such pages as large. Re-use one of the software PTE bits
> to indicate such pages.
>
> [1] See Erratum 383 of AMD Family 10h Processors
> [2] https://lore.kernel.org/linux-mm/1da1b025-cabc-6f04-bde5-e50830d1ecf0@xxxxxxx/
>
> [rppt@xxxxxxxxxx:
> * s/restore/collapse/
> * update formatting per peterz
> * use 'struct ptdesc' instead of 'struct page' for list of page tables to
> be freed
> * try to collapse PMD first and if it succeeds move on to PUD as peterz
> suggested
> * flush TLB twice: for changes done in the original CPA call and after
> collapsing of large pages
> ]
>
> Link: https://lore.kernel.org/all/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@xxxxxxxxxx>

When I originally attempted this, the patch was dropped because of
performance regressions. Have those regressions been addressed somehow?

--
Kiryl Shutsemau / Kirill A. Shutemov