Re: [PATCHv5 07/28] thp, mlock: do not allow huge pages in mlocked area

From: Kirill A. Shutemov
Date: Wed May 20 2015 - 08:11:25 EST


On Tue, May 19, 2015 at 04:37:25PM +0200, Vlastimil Babka wrote:
> On 05/15/2015 03:41 PM, Kirill A. Shutemov wrote:
> >On Fri, May 15, 2015 at 02:56:42PM +0200, Vlastimil Babka wrote:
> >>On 04/23/2015 11:03 PM, Kirill A. Shutemov wrote:
> >>>With the new refcounting, a THP can belong to several VMAs. This makes it
> >>>tricky to track THP pages when they are partially mlocked. It can lead to
> >>>leaking mlocked pages to non-VM_LOCKED vmas and other problems.
> >>>With this patch we will split all pages on mlock and avoid
> >>>faulting in or collapsing new THPs in VM_LOCKED vmas.
> >>>
> >>>I've tried an alternative approach: do not mark THP pages mlocked and keep
> >>>them on normal LRUs. This way vmscan could try to split huge pages under
> >>>memory pressure and free up subpages which don't belong to VM_LOCKED
> >>>vmas. But this is a user-visible change: we screw up the Mlocked accounting
> >>>reported in meminfo, so I had to leave this approach aside.
> >>>
> >>>We can bring something better later, but this should be good enough for
> >>>now.
> >>
> >>I can imagine people won't be happy about losing the benefits of THP when
> >>they mlock().
> >>How difficult would it be to support mlocked THP pages without splitting
> >>until something actually tries to do a partial (un)mapping, and only then do
> >>the split? That will support the most common case, no?
> >
> >Yes, it will.
> >
> >But what will we do if we fail to split a huge page on munmap()? Fail
> >munmap() with -EBUSY?
>
> We could just unmlock the whole THP page, and if we could make the deferred
> split happen ASAP instead of waiting for memory pressure, the window with
> NR_MLOCK being undercounted would be minimized. Since RLIMIT_MEMLOCK is
> tracked independently from NR_MLOCK, there should be no danger of breaching
> the limit due to undercounting here?

I'm not sure what "ASAP" should mean here and how to implement it.

I would really prefer to address mlock separately. The patchset is already
huge enough. :-/

--
Kirill A. Shutemov