Re: [PATCH -mm] vmscan: make mapped executable pages the first class citizen

From: KOSAKI Motohiro
Date: Sun May 10 2009 - 05:29:58 EST


>> >> The patch seems reasonable but the changelog and the (non-existent)
>> >> design documentation could do with a touch-up.
>> >
>> > Is it right that I as a user can do things like mmap my database
>> > PROT_EXEC to get better database numbers by making other
>> > stuff swap first ?
>> >
>> > You seem to be giving everyone a "nice my process up" hack.
>>
>> How about this?
>
> Why does it deserve more tricks? PROT_EXEC pages are rare.
> If user space is to abuse PROT_EXEC, let them be for it ;-)

Yes, typically rare.
The problem is that a user program _can_ use PROT_EXEC to get higher priority
even for non-executable memory.

In general, a static priority mechanism has one weakness: if every object
has the higher priority, the priority mechanism breaks down.


>> If priority < DEF_PRIORITY-2, aggressive lumpy reclaim in
>> shrink_inactive_list() already forcibly reclaims active pages.
>
> Isn't lumpy reclaim now enabled by (and only by) non-zero order?

You are right. I only meant that the kernel already has a threshold at which
it changes policy to prevent the worst case.


>> So this patch doesn't change the kernel's reclaim policy.
>>
>> Anyway, preventing the "nice my process up" hack in a way that a user
>> process cannot change seems to make sense to me.
>>
>> test result:
>>
>> echo 100 > /proc/sys/vm/dirty_ratio
>> echo 100 > /proc/sys/vm/dirty_background_ratio
>> run modified qsbench (uses mmap(PROT_EXEC) instead of malloc)
>>
>>           active2active vs active2inactive ratio
>> before    5:5
>> after     1:9
>
> Do you have scripts for producing such numbers? I'm dreaming of having
> such tools :-)

I made a statistics-reporting patch for testing, hehe :)

---
include/linux/vmstat.h | 1 +
mm/vmstat.c | 1 +
2 files changed, 2 insertions(+)

Index: b/include/linux/vmstat.h
===================================================================
--- a/include/linux/vmstat.h 2009-02-17 07:34:38.000000000 +0900
+++ b/include/linux/vmstat.h 2009-05-10 02:36:37.000000000 +0900
@@ -51,6 +51,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 		UNEVICTABLE_PGSTRANDED,	/* unable to isolate on unlock */
 		UNEVICTABLE_MLOCKFREED,
 #endif
+		FOR_ALL_ZONES(PGA2A),
 		NR_VM_EVENT_ITEMS
 };

Index: b/mm/vmstat.c
===================================================================
--- a/mm/vmstat.c 2009-05-10 01:08:36.000000000 +0900
+++ b/mm/vmstat.c 2009-05-10 02:37:18.000000000 +0900
@@ -708,6 +708,7 @@ static const char * const vmstat_text[]
 	"unevictable_pgs_stranded",
 	"unevictable_pgs_mlockfreed",
 #endif
+	TEXTS_FOR_ZONES("pga2a")
 #endif
 };
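
For anyone who wants to turn such counters into a ratio, a minimal sketch
follows. It sums every counter in a /proc/vmstat-style stream whose name
starts with a given prefix. The function name sum_vmstat_prefix() is made up
for illustration, and the per-zone names (pga2a_dma, pga2a_normal, ...) are
an assumption about what TEXTS_FOR_ZONES("pga2a") expands to; check the
actual /proc/vmstat output on a patched kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum every counter in a /proc/vmstat-style stream whose name starts
 * with `prefix`. Each entry is expected to look like "<name> <value>".
 * Parsing stops at the first entry that does not fit that shape.
 * Returns the total, which is 0 if nothing matched.
 */
unsigned long long sum_vmstat_prefix(FILE *fp, const char *prefix)
{
	char name[128];
	unsigned long long value, total = 0;
	size_t plen = strlen(prefix);

	while (fscanf(fp, "%127s %llu", name, &value) == 2) {
		if (strncmp(name, prefix, plen) == 0)
			total += value;
	}
	return total;
}
```

One possible use, again as an assumption rather than the script behind the
numbers above: open /proc/vmstat, call sum_vmstat_prefix(fp, "pga2a_"),
rewind, call it again with "pgdeactivate", and compare the deltas of the two
totals before and after a benchmark run.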



>> Please don't ask for performance numbers. I haven't reproduced the
>> improvement from Wu's patch ;)
>
> That's why I decided to "explain" instead of "benchmark" the benefits
> of my patch, hehe.

Okay, I see.