Re: [RFC V2] mm:add zero_page _mapcount when mapped into user space
From: Kirill A. Shutemov
Date: Thu Dec 04 2014 - 07:28:56 EST
On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Kirill A. Shutemov [mailto:kirill@xxxxxxxxxxxxx]
> > Sent: Tuesday, December 02, 2014 7:30 PM
> > To: Wang, Yalin
> > Cc: 'linux-kernel@xxxxxxxxxxxxxxx'; 'linux-mm@xxxxxxxxx'; 'linux-arm-
> > kernel@xxxxxxxxxxxxxxxxxxx'
> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> > space
> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> > > This patch increments/decrements zero_page's _mapcount so that the
> > > mapcount stays correct for zero_page. When read from /proc/kpagecount,
> > > zero_page's mapcount is then also correct, and userspace tools like
> > > procrank can calculate PSS correctly.
> > I don't have a specific code path to point to, but I would expect a zero
> > page with non-zero mapcount to cause problems with rmap.
> > How do you test the change?
> I just tested it by checking the mapcount via /proc/pid/pagemap and
> /proc/kpagecount. It works well.
I took a closer look and your patch is broken in multiple places:
- in zap_pte_range() you don't decrement mapcount;
- you don't update rss counters for mm;
- copy_one_pte() doesn't increase mapcount;
Basically, each and every vm_normal_page() call must be audited, as a first
step. And you totally skip the huge zero page.
Proper mapcount handling for the zero page would require a lot more work and I
don't think it's worth it. The gain is too small.
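
To illustrate why the three points above matter together, here is a toy
userspace model of the bookkeeping. It is not kernel code: struct toy_page,
struct toy_mm and all toy_* functions are made up for this sketch, and the
real _mapcount starts at -1 rather than 0. It only shows that the fault,
fork and zap paths have to stay symmetric or the count drifts.

```c
#include <assert.h>

/* Hypothetical toy types, not the kernel's struct page / struct mm_struct. */
struct toy_page { int mapcount; };  /* number of PTEs mapping the page */
struct toy_mm   { int rss; };       /* resident page counter of one mm */

/* Fault path: a new PTE now points at the page. */
static void toy_map(struct toy_mm *mm, struct toy_page *pg)
{
    pg->mapcount++;
    mm->rss++;
}

/* fork(): the child inherits the PTE, so both counters must follow. */
static void toy_copy(struct toy_mm *child, struct toy_page *pg)
{
    pg->mapcount++;
    child->rss++;
}

/* munmap()/exit: the PTE goes away again. */
static void toy_zap(struct toy_mm *mm, struct toy_page *pg)
{
    assert(pg->mapcount > 0 && mm->rss > 0);  /* catch underflow */
    pg->mapcount--;
    mm->rss--;
}

/*
 * Map the page, fork once, then tear both mappings down.
 * With dec_on_zap == 0 the zap side skips its accounting, which is the
 * kind of asymmetry pointed out for zap_pte_range() above.
 * Returns the page's final mapcount.
 */
static int toy_lifecycle(int dec_on_zap)
{
    struct toy_page zero = { 0 };
    struct toy_mm parent = { 0 }, child = { 0 };

    toy_map(&parent, &zero);
    toy_copy(&child, &zero);
    if (dec_on_zap) {
        toy_zap(&parent, &zero);
        toy_zap(&child, &zero);
    }
    return zero.mapcount;
}
```

In this model toy_lifecycle(1) ends with a mapcount of 0, while
toy_lifecycle(0) leaves it at 2 although nothing maps the page any more.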
> The problem is that when I look at /proc/pid/smaps,
> Rss/Pss don't account for zero_page mappings,
> because smaps_pte_entry() --> vm_normal_page()
> returns NULL for zero_page.
> But when a userspace process reads /proc/pid/pagemap,
> it sees zero_page as mapped
> and treats it as Rss.
> This is weird. Should we also omit zero_page from /proc/pid/pagemap?
> Or count zero_page as Rss in /proc/pid/smaps?
> I think we should count zero_page as Rss,
> because it really is mapped into the userspace address space,
> and this would make userspace memory analysis more accurate.
It would be easier for userspace to find out the pfn of the zero page and take
it into account.
Note: some architectures have multiple zero pages due to coloring.
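
A minimal sketch of how userspace could do that, assuming the documented
/proc/pid/pagemap entry layout (bit 63 = page present, bits 0-54 = PFN) and
assuming that a read fault on a fresh private anonymous mapping gets the zero
page. The helper names are made up for this example. It probes a single
address, so on architectures with several colored zero pages it only finds
one of them. Newer kernels also hide the PFN from processes without
CAP_SYS_ADMIN, in which case the probe returns 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)          /* page is present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)    /* bits 0-54 carry the PFN */

static int pagemap_present(uint64_t entry)
{
    return (entry & PM_PRESENT) != 0;
}

static uint64_t pagemap_pfn(uint64_t entry)
{
    return entry & PM_PFN_MASK;
}

/*
 * Hypothetical helper: returns the PFN backing a read-faulted anonymous
 * page of this process, or 0 if it cannot be determined (no /proc, page
 * not present, or PFN hidden from unprivileged callers).
 */
static uint64_t probe_zero_pfn(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    uint64_t entry = 0, pfn = 0;
    void *addr;
    int fd;

    if (psz <= 0)
        return 0;
    addr = mmap(NULL, (size_t)psz, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return 0;

    /* Read, never write: the read fault is what maps the zero page. */
    (void)*(volatile unsigned char *)addr;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd >= 0) {
        /* One 64-bit entry per virtual page, indexed by page number. */
        off_t off = (off_t)((uintptr_t)addr / (uintptr_t)psz) *
                    (off_t)sizeof(entry);

        if (pread(fd, &entry, sizeof(entry), off) == (ssize_t)sizeof(entry) &&
            pagemap_present(entry))
            pfn = pagemap_pfn(entry);
        close(fd);
    }
    munmap(addr, (size_t)psz);
    return pfn;
}
```

A memory analysis tool could call probe_zero_pfn() once at startup and then
skip, or count separately, every pagemap entry whose PFN matches the result.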
Kirill A. Shutemov