Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

From: Raghavendra K T
Date: Fri Aug 28 2015 - 02:39:03 EST

On 08/28/2015 12:08 AM, David Miller wrote:
> From: Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>
> Date: Wed, 26 Aug 2015 23:07:33 +0530
>
>> @@ -4641,10 +4647,12 @@ static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib,
>> static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
>> int bytes)
>> + u64 buff[IPSTATS_MIB_MAX] = {0,};
>> switch (attrtype) {
>> - __snmp6_fill_stats64(stats, idev->stats.ipv6,
>
> I would suggest using an explicit memset() here, it makes the overhead incurred
> by this scheme clearer.

To measure the fill_stat overhead, I stubbed out the stats fill (changed code below).

Container creation now took 3.012 sec. Previously it was:
without patch:      6.86 sec
with current patch: 3.34 sec

perf also no longer showed snmp6_fill_stats() in the parent call traces.

changed code:
switch (attrtype) {
put_unaligned(IPSTATS_MIB_MAX, &stats[0]);
memset(&stats[1], 0, IPSTATS_MIB_MAX-1);

//__snmp6_fill_stats64(stats, idev->stats.ipv6, IPSTATS_MIB_MAX, bytes,
// offsetof(struct ipstats_mib, syncp), buff);
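One detail about the memset() call above: memset() takes its length in bytes, so passing IPSTATS_MIB_MAX-1 clears that many bytes rather than that many u64 slots. Below is a minimal userspace sketch of the stubbed fill with the length scaled by the element size. N_ITEMS, fill_stub() and zeroed_slots() are hypothetical names used only for illustration; they are not from the patch, and N_ITEMS is not the real value of IPSTATS_MIB_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_ITEMS 8	/* hypothetical stand-in for IPSTATS_MIB_MAX */

/* Slot 0 carries the item count; the remaining slots are zeroed. */
static void fill_stub(uint64_t *stats)
{
	assert(stats != NULL);
	stats[0] = N_ITEMS;
	/* memset() counts bytes, so scale the slot count by the slot size. */
	memset(&stats[1], 0, (N_ITEMS - 1) * sizeof(stats[0]));
}

/* Poison a buffer, run the stub, and count how many of slots 1..N_ITEMS-1 ended up zero. */
int zeroed_slots(void)
{
	uint64_t stats[N_ITEMS];
	int i, n = 0;

	memset(stats, 0xff, sizeof(stats));
	fill_stub(stats);
	for (i = 1; i < N_ITEMS; i++)
		if (stats[i] == 0)
			n++;
	return n;
}
```

With the scaled length, zeroed_slots() reports all N_ITEMS - 1 counter slots as cleared. This should not change the timing picture above much, since the stub is only there to take the percpu walk out of the measurement.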

So in summary:
The current patch removes most of the overhead in fill_stat, though some
percpu walk overhead remains (the 0.33 sec difference between 3.34 sec and 3.012 sec).

[ The percpu walk overhead grows as more containers are created, e.g. 3k. ]
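For readers following the thread, here is a rough userspace sketch of the two walk orders the subject line refers to: folding each counter across all CPUs one item at a time, versus visiting each CPU's block once and adding every counter in that pass. It is only an illustration under stated assumptions. percpu_mib, NR_CPUS_SIM, N_ITEMS and the function names are hypothetical, and the kernel's real percpu accessors and u64 sync handling are not reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count */
#define N_ITEMS 6	/* hypothetical number of counters */

/* Simulated per-CPU counter blocks: one row per CPU. */
static uint64_t percpu_mib[NR_CPUS_SIM][N_ITEMS];

/* Item-major order: for each counter, visit every CPU's block. */
static void fold_item_major(uint64_t *out)
{
	int i, c;

	for (i = 0; i < N_ITEMS; i++) {
		out[i] = 0;
		for (c = 0; c < NR_CPUS_SIM; c++)
			out[i] += percpu_mib[c][i];
	}
}

/* CPU-major order: visit each CPU's block once and add all of its counters. */
static void fold_cpu_major(uint64_t *out)
{
	int i, c;

	memset(out, 0, N_ITEMS * sizeof(out[0]));
	for (c = 0; c < NR_CPUS_SIM; c++)
		for (i = 0; i < N_ITEMS; i++)
			out[i] += percpu_mib[c][i];
}

/* Fill the simulated blocks with a known pattern: cpu * 10 + item. */
static void seed_counters(void)
{
	int i, c;

	for (c = 0; c < NR_CPUS_SIM; c++)
		for (i = 0; i < N_ITEMS; i++)
			percpu_mib[c][i] = (uint64_t)(c * 10 + i);
}

/* Both walk orders must produce the same totals. */
void check_folds_agree(void)
{
	uint64_t a[N_ITEMS], b[N_ITEMS];

	seed_counters();
	fold_item_major(a);
	fold_cpu_major(b);
	assert(memcmp(a, b, sizeof(a)) == 0);
}
```

Both orders do the same number of additions. The difference is only in how often each CPU's block is revisited, which is why some walk cost remains either way and scales with the number of devices being reported.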

Cache misses: there was no major difference (around 1.4%) with respect to the patch.

Hi David,
I hope you wanted to know the overhead rather than have me change the current patch. Please let me know.

Eric, does the V2 patch look good now? If so, please add your ack/review.

time docker run -itd ubuntu:15.04 /bin/bash

real 0m3.012s
user 0m0.093s
sys 0m0.009s

# Samples: 18K of event 'cycles'
# Event count (approx.): 12838752009
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ..................
    15.29%  swapper  [kernel.kallsyms]  [k] snooze_loop
     9.37%  docker   docker             [.] scanblock
     6.47%  docker   [kernel.kallsyms]  [k] veth_stats_one
     3.87%  swapper  [kernel.kallsyms]  [k] _raw_spin_lock
     2.71%  docker   docker             [.]

