Re: [PATCH V5] Allow compaction of unevictable pages

From: Eric B Munson
Date: Mon Mar 16 2015 - 09:50:05 EST


On Mon, 16 Mar 2015, Vlastimil Babka wrote:

> [CC += linux-api@]
>
> Since this is a kernel-user-space API change, please CC linux-api@.
> The kernel source file Documentation/SubmitChecklist notes that all
> Linux kernel patches that change userspace interfaces should be CCed
> to linux-api@xxxxxxxxxxxxxxx, so that the various parties who are
> interested in API changes are informed. For further information, see
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.kernel.org_doc_man-2Dpages_linux-2Dapi-2Dml.html&d=AwIC-g&c=96ZbZZcaMF4w0F4jpN6LZg&r=aUmMDRRT0nx4IfILbQLv8xzE0wB9sQxTHI3QrQ2lkBU&m=GUotTNnv26L0HxtXrBgiHqu6kwW3ufx2_TQpXIA216c&s=IFFYQ7Zr-4SIaF3slOZqiSP_noyva42kCwVRxxDm5wo&e=

Added to the Cc list, thanks.

>
>
> On 03/13/2015 09:19 PM, Michal Hocko wrote:
> >On Fri 13-03-15 15:09:15, Eric B Munson wrote:
> >>On Fri, 13 Mar 2015, Rik van Riel wrote:
> >>
> >>>On 03/13/2015 01:26 PM, Eric B Munson wrote:
> >>>
> >>>>--- a/mm/compaction.c
> >>>>+++ b/mm/compaction.c
> >>>>@@ -1046,6 +1046,8 @@ typedef enum {
> >>>> ISOLATE_SUCCESS, /* Pages isolated, migrate */
> >>>> } isolate_migrate_t;
> >>>>
> >>>>+int sysctl_compact_unevictable;
>
> A comment here would be useful I think, as well as explicit default
> value. Maybe also __read_mostly although I don't know how much that
> matters.

I am going to sit on V6 for a couple of days incase anyone from rt wants
to chime in. But these will be in V6.

>
> I also wonder if it might be confusing that "compact_memory" is a
> write-only trigger that doesn't even show under "sysctl -a", while
> "compact_unevictable" is a read/write setting. But I don't have a
> better suggestion right now.

Does allow_unevictable_compaction sound better? It feels too much like
variable naming conventions from other languages which seems to
encourage verbosity to me, but does indicate a difference from
compact_memory.

>
> >>>>+
> >>>> /*
> >>>> * Isolate all pages that can be migrated from the first suitable block,
> >>>> * starting at the block pointed to by the migrate scanner pfn within
> >>>
> >>>I suspect that the use cases where users absolutely do not want
> >>>unevictable pages migrated are special cases, and it may make
> >>>sense to enable sysctl_compact_unevictable by default.
> >>
> >>Given that sysctl_compact_unevictable=0 is the way the kernel behaves
> >>now and the push back against always enabling compaction on unevictable
> >>pages, I left the default to be the behavior as it is today.
> >
> >The question is _why_ we have this behavior now. Is it intentional?
>
> It's there since 748446bb6 ("mm: compaction: memory compaction
> core"). Commit c53919adc0 ("mm: vmscan: remove lumpy reclaim")
> changes the comment in __isolate_lru_page() handling of unevictable
> pages to mention compaction explicitly. It could have been
> accidental in 748446bb6 though, maybe it just reused
> __isolate_lru_page() for compaction - it seems that the skipping of
> unevictable was initially meant to optimize lumpy reclaim.
>
> >e46a28790e59 (CMA: migrate mlocked pages) is a precedence in that
>
> Well, CMA and realtime kernels are probably mutually exclusive enough.
>
> >direction. Vlastimil has then changed that by edc2ca612496 (mm,
> >compaction: move pageblock checks up from isolate_migratepages_range()).
> >There is no mention about mlock pages so I guess it was more an
> >unintentional side effect of the patch. At least that is my current
> >understanding. I might be wrong here.
>
> Although that commit did change unintentionally more details that I
> would have liked (unfortunately), I think you are wrong on this one.
> ISOLATE_UNEVICTABLE is still passed from
> isolate_migratepages_range() which is used by CMA, while the
> compaction variant isolate_migratepages() does not pass it. So it's
> kept CMA-specific as before.
>
> >The thing about RT is that it is not usable with the upstream kernel
> >without the RT patchset AFAIU. So the default should be reflect what is
> >better for the standard kernel. RT loads have to tune the system anyway
> >so it is not so surprising they would disable this option as well. We
> >should help those guys and do not require them to touch the code but the
> >knob is reasonable IMHO.
> >
> >Especially when your changelog suggests that having this enabled by
> >default is beneficial for the standard kernel.
>
> I agree, but if there's a danger of becoming too of a bikeshed
> topic, I'm fine with keeping the default same as current behavior
> and changing it later. Or maybe we should ask some -rt mailing list
> instead of just Peter and Thomas?

According to the rt wiki, there is no -rt development list so lkml is
it. I will change the default to 1 for V6 if I don't hear otherwise by
the time I get back around to spinning V6.

Attachment: signature.asc
Description: Digital signature