Re: [HMM v13 08/18] mm/hmm: heterogeneous memory management (HMM for short)

From: Jerome Glisse
Date: Sun Nov 27 2016 - 08:11:06 EST

Next message: Mel Gorman: "[PATCH] mm: page_alloc: High-order per-cpu page allocator v3"
Previous message: Sergey Senozhatsky: "Re: [PATCH v3 3/3] zram: support BDI_CAP_STABLE_WRITES"
Next in thread: Anshuman Khandual: "Re: [HMM v13 08/18] mm/hmm: heterogeneous memory management (HMM for short)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Nov 23, 2016 at 09:33:35AM +0530, Anshuman Khandual wrote:
> On 11/18/2016 11:48 PM, Jérôme Glisse wrote:

[...]

> > + *
> > + * hmm_vma_migrate(vma, start, end, ops);
> > + *
> > + * With ops struct providing 2 callback alloc_and_copy() which allocated the
> > + * destination memory and initialize it using source memory. Migration can fail
> > + * after this step and thus last callback finalize_and_map() allow the device
> > + * driver to know which page were successfully migrated and which were not.
>
> So we have page->pgmap->free_devpage() to release the individual page back
> into the device driver management during migration and also we have this ops
> based finalize_and_mmap() to check on the failed instances inside a single
> migration context which can contain set of pages at a time.
>
> > + *
> > + * This can easily be use outside of HMM intended use case.
>
> Where you think this can be used outside of HMM ?

Well, on the radar is the new memory hierarchy that seems to be on every CPU
designer's roadmap, where you have a small, fast, HBM-like memory packaged with
the CPU and then you have the regular memory.

In the embedded world they want to migrate active processes to the fast CPU
memory and shut down the regular memory to save power.

In the HPC world they want to migrate the hot data of hot processes to this fast
memory.

In both cases we are talking about process-based memory migration, and in the
embedded case they also have a DMA engine they can use to offload the copy
operation itself.

These are the useful cases I have in mind, but other people might see that code
and realise they could also use it for their own specific corner case.

[...]

> > +/*
> > + * hmm_pfn_t - HMM use its own pfn type to keep several flags per page
> > + *
> > + * Flags:
> > + * HMM_PFN_VALID: pfn is valid
> > + * HMM_PFN_WRITE: CPU page table have the write permission set
> > + */
> > +typedef unsigned long hmm_pfn_t;
> > +
> > +#define HMM_PFN_VALID (1 << 0)
> > +#define HMM_PFN_WRITE (1 << 1)
> > +#define HMM_PFN_SHIFT 2
> > +
> > +static inline struct page *hmm_pfn_to_page(hmm_pfn_t pfn)
> > +{
> > +	if (!(pfn & HMM_PFN_VALID))
> > +		return NULL;
> > +	return pfn_to_page(pfn >> HMM_PFN_SHIFT);
> > +}
> > +
> > +static inline unsigned long hmm_pfn_to_pfn(hmm_pfn_t pfn)
> > +{
> > +	if (!(pfn & HMM_PFN_VALID))
> > +		return -1UL;
> > +	return (pfn >> HMM_PFN_SHIFT);
> > +}
> > +
> > +static inline hmm_pfn_t hmm_pfn_from_page(struct page *page)
> > +{
> > +	return (page_to_pfn(page) << HMM_PFN_SHIFT) | HMM_PFN_VALID;
> > +}
> > +
> > +static inline hmm_pfn_t hmm_pfn_from_pfn(unsigned long pfn)
> > +{
> > +	return (pfn << HMM_PFN_SHIFT) | HMM_PFN_VALID;
> > +}
>
> Hmm, so if we use last two bits on PFN as flags, it does reduce the number of
> bits available for the actual PFN range. But given that we support maximum of
> 64TB on POWER (not sure about X86) we can live with this two bits going away
> from the unsigned long. But what is the purpose of tracking validity and write
> flag inside the PFN ?

64TB is 2^46 bytes, so with a 12-bit PAGE_SHIFT we only need 34 bits for the pfn
value. Hence I should have enough room for my flags, or is unsigned long not
64 bits on powerpc?
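
As a quick sanity check of that arithmetic, here is a minimal userspace sketch
(not kernel code). It assumes a 64-bit unsigned long, modelled with uint64_t,
and 4KB pages. The EXAMPLE_* names and the example_*() helpers are made up for
illustration; the HMM_PFN_* definitions only mirror the ones in the patch.

```c
#include <stdint.h>

/* Userspace stand-in for the patch's type; assumes 64-bit unsigned long. */
typedef uint64_t hmm_pfn_t;

#define HMM_PFN_VALID ((hmm_pfn_t)1 << 0)
#define HMM_PFN_WRITE ((hmm_pfn_t)1 << 1)
#define HMM_PFN_SHIFT 2

/* Hypothetical constants for this example only. */
#define EXAMPLE_PAGE_SHIFT 12	/* assumption: 4KB pages */
#define EXAMPLE_PHYS_BITS  46	/* 64TB == 2^46 bytes */

static inline hmm_pfn_t hmm_pfn_from_pfn(uint64_t pfn)
{
	return (pfn << HMM_PFN_SHIFT) | HMM_PFN_VALID;
}

static inline uint64_t hmm_pfn_to_pfn(hmm_pfn_t pfn)
{
	if (!(pfn & HMM_PFN_VALID))
		return UINT64_MAX;	/* plays the role of -1UL in the patch */
	return pfn >> HMM_PFN_SHIFT;
}

/* Highest pfn for 64TB of 4KB pages: 2^(46 - 12) - 1 == 2^34 - 1. */
static inline uint64_t example_max_pfn(void)
{
	return (UINT64_C(1) << (EXAMPLE_PHYS_BITS - EXAMPLE_PAGE_SHIFT)) - 1;
}

/* Bits used by an encoded value: 34 pfn bits + 2 flag bits. */
static inline unsigned int example_bits_needed(void)
{
	return EXAMPLE_PHYS_BITS - EXAMPLE_PAGE_SHIFT + HMM_PFN_SHIFT;
}
```

Under these assumptions the largest encoded value, with both flags set, fits in
36 bits, which leaves 28 bits spare in a 64-bit word.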

Cheers,
Jérôme
