Re: [PATCH 1/5] mm: Add support for unaccepted memory
From: Kirill A. Shutemov
Date: Thu Aug 12 2021 - 16:34:50 EST
On Tue, Aug 10, 2021 at 05:21:48PM +0200, David Hildenbrand wrote:
> On 10.08.21 17:02, Kirill A. Shutemov wrote:
> > On Tue, Aug 10, 2021 at 09:48:04AM +0200, David Hildenbrand wrote:
> > > On 10.08.21 08:26, Kirill A. Shutemov wrote:
> > > > UEFI Specification version 2.9 introduces the concept of memory acceptance:
> > > > Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
> > > > require memory to be accepted before it can be used by the guest.
> > > > Accepting happens via a protocol specific to the Virtual Machine
> > > > platform.
> > > >
> > > > Accepting memory is costly and it makes VMM allocate memory for the
> > > > accepted guest physical address range. It's better to postpone memory
> > > > acceptance until memory is needed. It lowers boot time and reduces
> > > > memory overhead.
> > > >
> > > > Support of such memory requires a few changes in core-mm code:
> > > >
> > > > - memblock has to accept memory on allocation;
> > > >
> > > > - page allocator has to accept memory on the first allocation of the
> > > > page;
> > > >
> > > > Memblock change is trivial.
> > > >
> > > > Page allocator is modified to accept pages on the first allocation.
> > > > PageOffline() is used to indicate that the page requires acceptance.
> > > > The flag is currently used by hotplug and balloon. Such pages are not
> > > > available to page allocator.
> > > >
> > > > An architecture has to provide three helpers if it wants to support
> > > > unaccepted memory:
> > > >
> > > > - accept_memory() makes a range of physical addresses accepted.
> > > >
> > > > - maybe_set_page_offline() marks a page PageOffline() if it requires
> > > > acceptance. Used during boot to put pages on free lists.
> > > >
> > > > - clear_page_offline() makes a page accepted and clears
> > > > PageOffline().
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > > > ---
> > > > mm/internal.h | 14 ++++++++++++++
> > > > mm/memblock.c | 1 +
> > > > mm/page_alloc.c | 13 ++++++++++++-
> > > > 3 files changed, 27 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/internal.h b/mm/internal.h
> > > > index 31ff935b2547..d2fc8a17fbe0 100644
> > > > --- a/mm/internal.h
> > > > +++ b/mm/internal.h
> > > > @@ -662,4 +662,18 @@ void vunmap_range_noflush(unsigned long start, unsigned long end);
> > > > int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
> > > > unsigned long addr, int page_nid, int *flags);
> > > > +#ifndef CONFIG_UNACCEPTED_MEMORY
> > > > +static inline void maybe_set_page_offline(struct page *page, unsigned int order)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline void clear_page_offline(struct page *page, unsigned int order)
> > > > +{
> > > > +}
> > > > +
> > > > +static inline void accept_memory(phys_addr_t start, phys_addr_t end)
> > > > +{
> > > > +}
> > >
> > > Can we find better fitting names for the first two? The function names are
> > > way too generic. For example:
> > >
> > > accept_or_set_page_offline()
> > >
> > > accept_and_clear_page_offline()
> >
> > Sounds good.
> >
> > > I thought for a second if
> > > PAGE_TYPE_OPS(Unaccepted, offline)
> > > makes sense as well, not sure.
> >
> > I find Offline fitting the situation. Don't see a reason to add more
> > terminology here.
> >
> > > Also, please update the description of PageOffline in page-flags.h to
> > > include the additional usage with PageBuddy set at the same time.
> >
> > Okay.
> >
> > > I assume you don't have to worry about page_offline_freeze/thaw ... as we
> > > only set PageOffline initially, but not later at runtime when other
> > > subsystems (/proc/kcore) might stumble over it.
> >
> > I think so, but I would need to look at this code once again.
> >
>
> Another thing to look into would be teaching makedumpfile via vmcoreinfo
> about these special buddy pages:
>
> makedumpfile will naturally skip all PageOffline pages and skip PageBuddy
> pages if requested to skip free pages. It detects these pages via the
> mapcount value. You will want makedumpfile to treat them like PageOffline
> pages: kernel/crash_core.c
>
> #define PAGE_BUDDY_MAPCOUNT_VALUE (~PG_buddy)
> VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
>
> #define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline)
> VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE);
>
> We could export PAGE_BUDDY_OFFLINE_MAPCOUNT_VALUE or just compute it inside
> makedumpfile from the other two values.
Thanks for digging it up. I'll look into makedumpfile, but it's not at the
top of my todo list, so it may take a while.
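
For reference, a minimal sketch of the "compute it from the other two
values" option. It assumes a page type is set by clearing its bit in a word
that starts out all-ones, which is what the (~PG_buddy) and (~PG_offline)
definitions quoted above suggest, so a page with both types set has both
bits cleared. The helper names are hypothetical and not taken from
makedumpfile:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the value expected for a page that is both
 * PageBuddy and PageOffline from the two numbers already exported through
 * vmcoreinfo. Assumption: each exported value is an all-ones word with one
 * type bit cleared, so ANDing them clears both bits.
 */
static inline uint32_t buddy_offline_mapcount_value(uint32_t buddy_value,
						    uint32_t offline_value)
{
	return buddy_value & offline_value;
}

/*
 * Hypothetical check a dump filter could apply to the raw mapcount/page_type
 * word of a page to recognise an unaccepted page sitting on a free list.
 */
static inline bool page_is_buddy_offline(uint32_t mapcount,
					 uint32_t buddy_value,
					 uint32_t offline_value)
{
	return mapcount == buddy_offline_mapcount_value(buddy_value,
							offline_value);
}
```

With this, a page that is only PageBuddy or only PageOffline does not match,
so the existing checks for those two types would stay as they are.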
--
Kirill A. Shutemov