Re: [PATCH v3 0/4] Split page_type out from mapcount

From: Martin Schwidefsky
Date: Thu Mar 01 2018 - 09:47:43 EST


On Thu, 1 Mar 2018 15:44:12 +0300
"Kirill A. Shutemov" <kirill@xxxxxxxxxxxxx> wrote:

> On Thu, Mar 01, 2018 at 08:17:50AM +0100, Martin Schwidefsky wrote:
> > On Wed, 28 Feb 2018 14:31:53 -0800
> > Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > > From: Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>
> > >
> > > I want to use the _mapcount field to record what a page is in use as.
> > > This can help with debugging and we can also expose that information to
> > > userspace through /proc/kpageflags to help diagnose memory usage (not
> > > included as part of this patch set).
> > >
> > > First, we need s390 to stop using _mapcount for its own purposes;
> > > Martin, I hope you have time to look at this patch. I must confess I
> > > don't quite understand what the different bits are used for in the upper
> > > nybble of the _mapcount, but I tried to replicate what you were doing
> > > faithfully.
> >
> > Yeah, that is a nasty bit of code. On s390 we have 2K page tables (pte)
> > but 4K pages. If we use full pages for the pte tables we waste 2K of
> > memory for each of the tables. So we allocate 4K and split it into two
> > 2K pieces. Now we have to keep track of the pieces to be able to free
> > them again.
>
> Have you considered to use slab for page table allocation instead?
> IIRC some architectures practice this already.

Well there is a complication with KVM and the page table management for
gmaps. If mm_alloc_pgste(mm) == true then a 4K page page table has to be
allocated. For the gmap I need a place to store an 8 byte value, currently
we use page->index. But the slab/slub code uses page->index for its own
purpose. This creates a conflict, but maybe doing a get_free_page for
mm_alloc_pgste(mm) == true and using a slab cache for normal page tables
might work.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.