Re: [PATCH 0/3] Allow user to request memory to be locked on page fault

From: Eric B Munson
Date: Mon May 11 2015 - 10:37:29 EST

On Fri, 08 May 2015, Andrew Morton wrote:

> On Fri, 8 May 2015 16:06:10 -0400 Eric B Munson <emunson@xxxxxxxxxx> wrote:
> > On Fri, 08 May 2015, Andrew Morton wrote:
> >
> > > On Fri, 8 May 2015 15:33:43 -0400 Eric B Munson <emunson@xxxxxxxxxx> wrote:
> > >
> > > > mlock() allows a user to control page out of program memory, but this
> > > > comes at the cost of faulting in the entire mapping when it is
> > > > allocated. For large mappings where the entire area is not necessary
> > > > this is not ideal.
> > > >
> > > > This series introduces new flags for mmap() and mlockall() that allow a
> > > > user to specify that the covered area should not be paged out, but only
> > > > after the memory has been used the first time.
> > >
> > > Please tell us much much more about the value of these changes: the use
> > > cases, the behavioural improvements and performance results which the
> > > patchset brings to those use cases, etc.
> > >
> >
> > The primary use case is for mmapping large files read only. The process
> > knows that some of the data is necessary, but it is unlikely that the
> > entire file will be needed. The developer only wants to pay the cost to
> > read the data in once. Unfortunately, the developer must choose between
> > allowing the kernel to page in the memory as needed and guaranteeing
> > that the data will only be read from disk once. The first option runs
> > the risk of having the memory reclaimed if the system is under memory
> > pressure, the second forces the memory usage and startup delay when
> > faulting in the entire file.
> Why can't the application mmap only those parts of the file which it
> wants and mlock those?

There are a number of problems with this approach. The first is that it
presumes the program will know which portions are needed ahead of time.
In many cases this is simply not true. The second problem is the number
of syscalls required. With my patches, a single mmap() or mlockall()
call is needed to set up the required locking. Without them, a separate
mmap() call must be made for each piece of data that is needed. This also
causes problems for data that is laid out assuming it is contiguous in
memory. With the single mmap() call, the user gets a contiguous VMA
without having to arrange for it. mmap() with MAP_FIXED could address
the problem, but this introduces a new failure mode: the mapping
colliding with another that was placed by the kernel.
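
To make the per-piece cost concrete, here is a rough sketch of what an
application has to do today, without the patches. struct piece,
page_floor() and map_and_lock_piece() are names made up for this
illustration; only mmap(), mlock(), munmap() and sysconf() are real
interfaces.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical description of one region of the file the program needs. */
struct piece {
    off_t  offset;   /* byte offset of the data within the file */
    size_t len;      /* length of the data in bytes */
    void  *map;      /* start of the page-aligned mapping, NULL if unmapped */
    size_t map_len;  /* length of the mapping, needed later for munmap() */
};

/* Round a file offset down to a page boundary, as mmap() requires. */
static off_t page_floor(off_t off, long page_size)
{
    return off - (off % page_size);
}

/*
 * Map and lock a single piece. This costs two syscalls per piece, and the
 * caller must already know offset and len. Returns a pointer to the data,
 * or NULL if either the mapping or the locking failed.
 */
static void *map_and_lock_piece(int fd, struct piece *p)
{
    long page = sysconf(_SC_PAGESIZE);
    off_t start = page_floor(p->offset, page);
    size_t slack = (size_t)(p->offset - start);

    p->map_len = p->len + slack;
    p->map = mmap(NULL, p->map_len, PROT_READ, MAP_PRIVATE, fd, start);
    if (p->map == MAP_FAILED) {
        p->map = NULL;
        return NULL;
    }
    /* mlock() faults in and pins every page of this piece immediately. */
    if (mlock(p->map, p->map_len) != 0) {
        munmap(p->map, p->map_len);
        p->map = NULL;
        return NULL;
    }
    return (char *)p->map + slack;
}
```

Each piece ends up in its own VMA at an address the kernel picks, so
two pieces that are adjacent in the file are not guaranteed to be
adjacent in memory.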

Another use case for the LOCKONFAULT flag is the security use of
mlock(). Consider an application that uses data which cannot be written
to swap, but whose exact size is unknown until run time (all we have at
build time is the maximum size the buffer can be). The LOCKONFAULT flag
allows the developer to create the buffer and guarantee that the
contents are never written to swap without ever consuming more memory
than is actually needed.
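
For anyone who wants to see the memory cost for themselves, here is a
small sketch that uses only existing interfaces (mmap(), mincore(),
sysconf()). reserve_buffer() and resident_pages() are hypothetical
helper names, and the sketch deliberately does not use the new flag
since it only exists in this series.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve an anonymous buffer of the build-time maximum size.
 * Returns NULL on failure. Nothing is locked here.
 */
static void *reserve_buffer(size_t max_len)
{
    void *buf = mmap(NULL, max_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return buf == MAP_FAILED ? NULL : buf;
}

/* Count how many pages of [addr, addr + len) are resident; -1 on error. */
static long resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    /* The low bit of each vector entry is set when the page is resident. */
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;
    free(vec);
    return count;
}
```

Calling mlock() on the whole buffer right after reserve_buffer() makes
every page resident up front, which is exactly the cost I want to
avoid. With the proposed flag, I would expect resident_pages() to count
only the pages the application has actually touched, with those pages
locked from that point on.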

> > I am working on getting startup times with and without this change for
> > an application, I will post them as soon as I have them.
