Re: [PATCH v5 1/6] mm/page_idle: Add per-pid idle page tracking using virtual index

From: Joel Fernandes
Date: Tue Aug 13 2019 - 09:51:58 EST


On Tue, Aug 13, 2019 at 11:14:30AM +0200, Michal Hocko wrote:
> On Mon 12-08-19 10:56:20, Joel Fernandes wrote:
> > On Thu, Aug 08, 2019 at 10:00:44AM +0200, Michal Hocko wrote:
> > > On Wed 07-08-19 17:31:05, Joel Fernandes wrote:
> > > > On Wed, Aug 07, 2019 at 01:58:40PM -0700, Andrew Morton wrote:
> > > > > On Wed, 7 Aug 2019 16:45:30 -0400 Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > On Wed, Aug 07, 2019 at 01:04:02PM -0700, Andrew Morton wrote:
> > > > > > > On Wed, 7 Aug 2019 13:15:54 -0400 "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > > In Android, we are using this for the heap profiler (heapprofd) which
> > > > > > > > profiles and pin points code paths which allocates and leaves memory
> > > > > > > > idle for long periods of time. This method solves the security issue
> > > > > > > > with userspace learning the PFN, and while at it is also shown to yield
> > > > > > > > better results than the pagemap lookup, the theory being that the window
> > > > > > > > where the address space can change is reduced by eliminating the
> > > > > > > > intermediate pagemap look up stage. In virtual address indexing, the
> > > > > > > > process's mmap_sem is held for the duration of the access.
> > > > > > >
> > > > > > > So is heapprofd a developer-only thing? Is heapprofd included in
> > > > > > > end-user android loads? If not then, again, wouldn't it be better to
> > > > > > > make the feature Kconfigurable so that Android developers can enable it
> > > > > > > during development then disable it for production kernels?
> > > > > >
> > > > > > Almost all of this code is already configurable with
> > > > > > CONFIG_IDLE_PAGE_TRACKING. If you disable it, then all of this code gets
> > > > > > disabled.
> > > > > >
> > > > > > Or are you referring to something else that needs to be made configurable?
> > > > >
> > > > > Yes - the 300+ lines of code which this patchset adds!
> > > > >
> > > > > The impacted people will be those who use the existing
> > > > > idle-page-tracking feature but who will not use the new feature. I
> > > > > guess we can assume this set is small...
> > > >
> > > > Yes, I think this set should be small. The code size increase of page_idle.o
> > > > is from ~1KB to ~2KB. Most of the extra space is consumed by
> > > > page_idle_proc_generic() function which this patch adds. I don't think adding
> > > > another CONFIG option to disable this while keeping existing
> > > > CONFIG_IDLE_PAGE_TRACKING enabled, is worthwhile but I am open to the
> > > > addition of such an option if anyone feels strongly about it. I believe that
> > > > once this patch is merged, most like this new interface being added is what
> > > > will be used more than the old interface (for some of the usecases) so it
> > > > makes sense to keep it alive with CONFIG_IDLE_PAGE_TRACKING.
> > >
> > > I would tend to agree with Joel here. The functionality falls into an
> > > existing IDLE_PAGE_TRACKING config option quite nicely. If there really
> > > are users who want to save some space and this is standing in the way
> > > then they can easily add a new config option with some justification so
> > > the savings are clear. Without that an additional config simply adds to
> > > the already existing configurability complexity and balkanization.
> >
> > Michal, Andrew, Minchan,
> >
> > Would you have any other review comments on the v5 series? This is just a new
> > interface that does not disrupt existing users of the older page-idle
> > tracking, so as such it is a safe change (as in, doesn't change existing
> > functionality except for the draining bug fix).
>
> I hope to find some more time to finish the review but let me point out
> that "it's new it is regression safe" is not really a great argument for
> a new user visible API.

Actually, I think you misunderstood me and took it out of context. I never
intended to say "it is regression safe". I meant to say it is "low risk", as
in that in all likelihood should not be hurting *existing users* of the *old
interface*. Also as you know, it has been tested.

> If the API is flawed then this is likely going
> to kick us later and will be hard to fix. I am still not convinced about
> the swap part of the thing TBH.

Ok, then let us discuss it. As I mentioned before, without this we lose the
access information due to MADVISE or swapping. Minchan and Konstantin both
suggested it that's why I also added it (other than me also realizing that it
is neeed). For x86, it uses existing bits in pte so it is not adding any more
bits. For arm64, it uses unused bits that the hardware cannot use. So I
don't see why this is an issue to you.

thanks,

- Joel