Re: [PATCHv5 07/28] thp, mlock: do not allow huge pages in mlocked area

From: Vlastimil Babka
Date: Tue May 19 2015 - 10:37:38 EST


On 05/15/2015 03:41 PM, Kirill A. Shutemov wrote:
> On Fri, May 15, 2015 at 02:56:42PM +0200, Vlastimil Babka wrote:
>> On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
>>> With new refcounting THP can belong to several VMAs. This makes it tricky
>>> to track THP pages when they are partially mlocked. It can lead to leaking
>>> mlocked pages to non-VM_LOCKED vmas and other problems.
>>> With this patch we will split all pages on mlock and avoid
>>> fault-in/collapse of new THP in VM_LOCKED vmas.
>>>
>>> I've tried an alternative approach: do not mark THP pages mlocked and keep
>>> them on normal LRUs. This way vmscan could try to split huge pages on
>>> memory pressure and free up subpages which don't belong to VM_LOCKED
>>> vmas. But this is a user-visible change: we screw up Mlocked accounting
>>> reported in meminfo, so I had to leave this approach aside.
>>>
>>> We can bring something better later, but this should be good enough for
>>> now.
>>
>> I can imagine people won't be happy about losing the benefits of THP when
>> they mlock().
>> How difficult would it be to support mlocked THP pages without splitting
>> until something actually tries to do a partial (un)mapping, and only then do
>> the split? That will support the most common case, no?
>
> Yes, it will.
>
> But what will we do if we fail to split a huge page on munmap()? Fail
> munmap() with -EBUSY?

We could just unmlock the whole THP, and if we could get the deferred
split done ASAP rather than waiting for memory pressure, the window in
which NR_MLOCK is undercounted would be minimized. Since RLIMIT_MEMLOCK
is tracked independently of NR_MLOCK, there should be no danger of
breaching the limit due to undercounting here?
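
To make the distinction between the two counters concrete, here is a rough
userspace sketch, not part of the patch set. It assumes Linux with /proc
mounted, and it assumes that VmLck in /proc/self/status reflects the per-mm
locked_vm count that RLIMIT_MEMLOCK is checked against, while Mlocked in
/proc/meminfo reflects the system-wide NR_MLOCK counter. The helper names
(parse_kb, read_text, snapshot) are made up for illustration.

```python
import resource


def parse_kb(text, key):
    """Return the integer kB value for `key` in 'Key:   N kB' style text, or None."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == key:
            fields = rest.split()
            if fields and fields[0].isdigit():
                return int(fields[0])
    return None


def read_text(path):
    """Read a /proc file, returning an empty string if it is unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


def snapshot():
    """Collect the per-process limit and both locked-memory counters."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        # Per-process limit; may be resource.RLIM_INFINITY.
        "rlimit_memlock_soft_bytes": soft,
        # Per-process locked memory (assumed to mirror mm->locked_vm).
        "vmlck_kb": parse_kb(read_text("/proc/self/status"), "VmLck"),
        # System-wide counter (assumed to mirror NR_MLOCK).
        "mlocked_kb": parse_kb(read_text("/proc/meminfo"), "Mlocked"),
    }


if __name__ == "__main__":
    print(snapshot())
```

Comparing snapshots taken before and after an mlock() call in the same
process should show VmLck and Mlocked moving separately from the limit
itself, which is why a temporarily low NR_MLOCK would not by itself let a
process exceed RLIMIT_MEMLOCK.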

