Re: [RESEND PATCH V2 0/3] Allow user to request memory to be locked on page fault

From: Eric B Munson
Date: Thu Jun 11 2015 - 15:21:57 EST


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 06/10/2015 05:59 PM, Andrew Morton wrote:
> On Wed, 10 Jun 2015 09:26:47 -0400 Eric B Munson
> <emunson@xxxxxxxxxx> wrote:
>
>> mlock() allows a user to control page out of program memory, but
>> this comes at the cost of faulting in the entire mapping when it is
>
> s/mapping/locked area/

Done.

>
>> allocated. For large mappings where the entire area is not
>> necessary this is not ideal.
>>
>> This series introduces new flags for mmap() and mlockall() that
>> allow a user to specify that the covered area should not be paged
>> out, but only after the memory has been used the first time.
>
> The comparison with MCL_FUTURE is hiding over in the 2/3 changelog.
> It's important so let's copy it here.
>
> : MCL_ONFAULT is preferable to MCL_FUTURE for the use cases enumerated
> : in the previous patch because MCL_FUTURE will behave as if each mapping
> : was made with MAP_LOCKED, causing the entire mapping to be faulted in
> : when new space is allocated or mapped.  MCL_ONFAULT allows the user to
> : delay the fault-in cost of any given page until it is actually needed,
> : but then guarantees that that page will always be resident.

Done
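
For readers who want to see the fault-in cost described above, here is a
minimal sketch. It uses only existing interfaces (mmap(), mlock(),
mincore()) and none of the flags proposed in this series. The helper name
resident_after_map() is made up for illustration. The idea is that a fresh
anonymous mapping normally has no resident pages until it is touched,
while locking it forces every page in up front.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `npages` anonymous pages, optionally mlock()
 * them, and return how many are resident according to mincore().
 * Returns -1 on any failure (for example when RLIMIT_MEMLOCK is too low
 * for mlock() to succeed).
 */
static int resident_after_map(size_t npages, int do_mlock)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t len = npages * (size_t)page;
    unsigned char *vec = malloc(npages);
    if (!vec)
        return -1;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        free(vec);
        return -1;
    }

    int resident = -1;
    if ((!do_mlock || mlock(p, len) == 0) && mincore(p, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* bit 0 set: page is in core */
    }

    munmap(p, len);   /* unmapping also drops the lock */
    free(vec);
    return resident;
}
```

Calling resident_after_map(n, 1) should report all n pages resident when
mlock() succeeds, even though the program never touched them. That eager
fault-in is the cost lock-on-fault is meant to defer.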

>
> I *think* it all looks OK. I'd like someone else to go over it
> also if poss.
>
>
> I guess the 2/3 changelog should have something like
>
> : munlockall() will clear MCL_ONFAULT on all vma's in the process's VM.

Done

>
> It's pretty obvious, but the manpage delta should make this clear
> also.

Done

>
>
> Also the changelog(s) and manpage delta should explain that
> munlock() clears MCL_ONFAULT.

Done

>
> And now I'm wondering what happens if userspace does
> mmap(MAP_LOCKONFAULT) and later does munlock() on just part of
> that region. Does the vma get split? Is this tested? Should also
> be in the changelogs and manpage.
>
> Ditto mlockall(MCL_ONFAULT) followed by munlock(). I'm not sure
> that even makes sense but the behaviour should be understood and
> tested.

I have extended the kselftest for lock-on-fault to try both of these
scenarios and they work as expected. The VMA is split and the VM
flags are set appropriately for the resulting VMAs.
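
The splitting behaviour can be observed from userspace. The sketch below
is not the kselftest mentioned above. It is an illustration that uses
plain mlock() rather than the proposed MAP_LOCKONFAULT flag, on the
assumption that partial munlock() splits the VMA the same way in both
cases. The function names are made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Count the entries in /proc/self/maps that overlap [start, end). */
static int count_vmas(unsigned long start, unsigned long end)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char *line = NULL;
    size_t cap = 0;
    int n = 0;
    while (getline(&line, &cap, f) != -1) {
        unsigned long s, e;
        /* Each line starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &s, &e) == 2 && s < end && e > start)
            n++;
    }
    free(line);
    fclose(f);
    return n;
}

/*
 * Lock three pages, unlock only the middle one, and report how many VMAs
 * cover the region afterwards.  Returns -1 if any step fails.
 */
static int vmas_after_partial_munlock(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = 3 * (size_t)page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int n = -1;
    if (mlock(p, len) == 0 && munlock(p + page, (size_t)page) == 0)
        n = count_vmas((unsigned long)p, (unsigned long)p + len);

    munmap(p, len);
    return n;
}
```

On a typical Linux kernel the region should end up covered by three VMAs
(locked, unlocked, locked), since adjacent VMAs with different lock flags
cannot be merged.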

>
>
> What's missing here is a syscall to set VM_LOCKONFAULT on an
> arbitrary range of memory - mlock() for lock-on-fault. It's a
> shame that mlock() didn't take a `mode' argument. Perhaps we
> should add such a syscall - that would make the mmap flag unneeded
> but I suppose it should be kept for symmetry.

Do you want such a system call as part of this set? I would need some
time to make sure I had thought through all the possible corners one
could get into with such a call, so it would delay a V3 quite a bit.
Otherwise I can send a V3 out immediately.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQIcBAEBAgAGBQJVed+3AAoJELbVsDOpoOa9eHwP+gO8QmNdUKN55wiTLxXdFTRo
TTm62MJ3Yk45+JJ+8xI1POMSUVEBAX7pxnL8TpNPmwp+UF6IQT/hAnnEFNud8/aQ
5bAxU9a5fRO6Q5533woaVpYfXZXwXAla+37MGQziL7O0VEi2aQ9abX7AKnkjmXwq
e1Fc3vutAycNCzSxg42GwZxqHw83TYztyv3C4Cc7lShbCezABYvaDvXcUZkGwhjG
KJxSPYS2E0nv0MEy995P0L0H1A/KHq6mCOFFKQw6aVbPDs8J/0RhvQIlp/BBCPMV
TqDVxMBpTpdWs6reJnUZpouKBTA11KTvUA2HBVn5B14u2V7Np+NBpLKH2DUqAP2v
Gyg4Nj0MknqB1rutaBjHjI0ZefrWK5o+zWAVKZs+wtq9WkmCvTYWp505XnlJO+qo
1CEnab2kX8P74UYcsJUrJxAtxc94t6oLh305KnJheQUdcx/ZNKboB2vl1+np10jj
oZLmP2RfajZoPojPZ/bI6mj9Ffqf/Ptau+kLQ56G1IuVmQRi4ZgQ9D1+BILXyKHi
uycKovcHVffiQ+z1Ama2b4wP1t5yjNdxBH0oV1KMeScCxfyYHPFuDBe36Krjo8FO
dDMyibNIRJMX6SeYNIRni40Eafon5h21I95/yWxUaq0FGBZ1NuuSTofxAA53wJJz
f0FUI7f53Oxk9EKk8nfg
=gfVJ
-----END PGP SIGNATURE-----