Re: [RFC PATCH 5/7] perf tools: Support setting different slots in a BPF map separately

From: Wangnan (F)
Date: Mon Nov 23 2015 - 00:46:42 EST




On 2015/11/23 10:01, Wangnan (F) wrote:


On 2015/11/20 23:34, Arnaldo Carvalho de Melo wrote:
Em Fri, Nov 20, 2015 at 09:25:36PM +0800, Wangnan (F) escreveu:
+ case BPF_MAP_PRIV_KEY_INDICS:
+ for (i = 0; i < priv->key.indics.nr_indics; i++) {
+ u64 _idx = priv->key.indics.indics[i];
+ unsigned int idx = (unsigned int)(_idx);
+
+ err = (*func)(name, map_fd, &def,
+ priv, &idx, arg);
+ if (err) {
+ pr_debug("ERROR: failed to insert value to %s[%u]\n",
+ name, idx);
+ return err;
+ }
+ }
This for-loop has a potential problem that, if perf's user want to
set a very big array using indices, for example:

# perf record -e
mybpf.c/maps:mymap:values[1,2,3,10-100000,200000-400000]=3/
mybpf.c/maps:mymap:values[100000-200000]=3/ ...

Perf would alloc nearly 300000 slots for indices array, consume too much
memory.

I will fix this problem by reinterprete indices array, makes negative
value represent range start and use next slot to store range size. For
example, the above perf cmdline can be converted to:

{1,2,3,-10, 99991,-200000,200001} and {-100000,100001}.
Why is that changing the way you specify what entries should be set to
a value will make it not allocate too much memory?

It is actually a problem in the next patch, in which it expand all range
into a series of indices. If user wants 1-10000, it creates an array as
[1,2,3,4,...10000], so user is possible to use a simple cmdline to consume
all of available memory.

However, the method I described above is not the best way to solve this probelm.
I thought yesterday that we should not insist on indices array. We can
make parser always return ranges. For example, [1,2,3-5] can be represent
using [(1,1), (2,1), (3,3)], so we don't need the above ugly negative
indicators.

I found the first form of representing ( start-end ) to be better than (
-start, size ), but I would use what the C language uses for expressing
ranges in switch case ranges, which is familiar and doesn't reuses the
minus arithmetic operator to express a range, i.e.:

# perf record -e \
mybpf.c/maps:mymap:values[1,2,3,10..100000,200000..400000]=3/

# perf record -e \
mybpf.c/maps:mymap:values[100000..200000]=3/ ...

'..' is better.


One problem: the case range syntax is introduced by gcc extension, not a
part of standard, and should be '...'.

Please see: https://gcc.gnu.org/onlinedocs/gcc/Case-Ranges.html

So I'll use '...' also.

Thank you.



Thank you.

- Arnaldo


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/