Re: [PATCH/RFC v2] perf core: Allow setting up max frame stack depth via sysctl

From: David Ahern
Date: Fri Apr 22 2016 - 18:05:39 EST


On 4/22/16 2:52 PM, Arnaldo Carvalho de Melo wrote:
Em Wed, Apr 20, 2016 at 04:04:12PM -0700, Alexei Starovoitov escreveu:
On Wed, Apr 20, 2016 at 07:47:30PM -0300, Arnaldo Carvalho de Melo wrote:

Nice. I like it. That's a great approach to hard problem.
Java guys will be happy too.
Please also adjust two places in kernel/bpf/stackmap.c

+ {
+ .procname = "perf_event_max_stack",
+ .data = NULL, /* filled in by handler */
+ .maxlen = sizeof(sysctl_perf_event_max_stack),
+ .mode = 0644,
+ .proc_handler = perf_event_max_stack_handler,
+ .extra1 = &zero,

zero seems to be the wrong minimum. I think it should be at least 2 to
to fit user/kernel tags ? Probably needs to define max as well.

So, if someone asks for zero, it will not copy anything, but then, this
would be what the user had asked for :-)

Ditto for the max, if someone asks for too big a callchain, then when
allocating it it will fail and no callchain will be produced, that or it
will be able to allocate but will take too long copying that many
addresses, and we would be prevented from doing so by some other
protection, iirc there is perf_cpu_time_max_percent, and then buffer
space will run out.

So I think that leaving it as is is enough, no?

Can I keep your Acked-by? David, can I keep yours?

Yes


diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 343c22f5e867..6fe77349fa9d 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -18,6 +18,14 @@ struct callchain_cpus_entries {
struct perf_callchain_entry *cpu_entries[0];
};

+int sysctl_perf_event_max_stack __read_mostly = PERF_MAX_STACK_DEPTH;
+
+static size_t perf_callchain_entry__sizeof(void)
+{
+ return sizeof(struct perf_callchain_entry) +
+ sizeof(__u64) * sysctl_perf_event_max_stack;
+}
+

To Alexei's comment, a max_stack of 0 still has a non-zero alloc size so that should be ok.


static DEFINE_PER_CPU(int, callchain_recursion[PERF_NR_CONTEXTS]);
static atomic_t nr_callchain_events;
static DEFINE_MUTEX(callchain_mutex);
@@ -73,7 +81,7 @@ static int alloc_callchain_buffers(void)
if (!entries)
return -ENOMEM;

- size = sizeof(struct perf_callchain_entry) * PERF_NR_CONTEXTS;
+ size = perf_callchain_entry__sizeof() * PERF_NR_CONTEXTS;

for_each_possible_cpu(cpu) {
entries->cpu_entries[cpu] = kmalloc_node(size, GFP_KERNEL,