[34-longterm 007/196] perf: Better fit max unprivileged mlock pages for tools needs

From: Paul Gortmaker
Date: Mon Mar 12 2012 - 20:17:38 EST


From: Frederic Weisbecker <fweisbec@xxxxxxxxx>

-------------------
This is a commit scheduled for the next v2.6.34 longterm release.
If you see a problem with using this for longterm, please comment.
-------------------

commit 880f57318450dbead6a03f9e31a1468924d6dd88 upstream.

The maximum kilobytes of locked memory that an unprivileged user
can reserve is of 512 kB = 128 pages by default, scaled to the
number of onlined CPUs, which fits well with the tools that use
128 data pages by default.

However tools actually use 129 pages, because they need one more
for the user control page. Thus the default mlock threshold is
not sufficient for the default tools needs and we always end up
to evaluate the constant mlock rlimit policy, which doesn't have
this scaling with the number of online CPUs.

Hence, on systems that have more than 16 CPUs, we overlap the
rlimit threshold and fail to mmap:

$ perf record ls
Error: failed to mmap with 1 (Operation not permitted)

Just increase the max unprivileged mlock threshold by one page
so that it supports well perf tools even after 16 CPUs.

Reported-by: Han Pingtian <phan@xxxxxxxxxx>
Reported-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Reported-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Acked-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Stephane Eranian <eranian@xxxxxxxxxx>
LKML-Reference: <1300904979-5508-1-git-send-email-fweisbec@xxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Paul Gortmaker <paul.gortmaker@xxxxxxxxxxxxx>
---
kernel/perf_event.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 2357b19..b203546 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -57,7 +57,8 @@ static atomic_t nr_task_events __read_mostly;
*/
int sysctl_perf_event_paranoid __read_mostly = 1;

-int sysctl_perf_event_mlock __read_mostly = 512; /* 'free' kb per user */
+/* Minimum for 128 pages + 1 for the user control page */
+int sysctl_perf_event_mlock __read_mostly = 516; /* 'free' kb per user */

/*
* max perf event sample rate
--
1.7.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/