Re: [PATCH RFC V4 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

From: Eric Dumazet
Date: Sun Aug 30 2015 - 11:52:44 EST


On Sun, 2015-08-30 at 11:29 +0530, Raghavendra K T wrote:
> Docker container creation time increased linearly from around 1.6 sec to
> 7.5 sec (at 1000 containers), and perf data showed 50% overhead in
> snmp_fold_field.
>
> reason: currently __snmp6_fill_stats64 calls snmp_fold_field, which walks
> through the per cpu data one item at a time (iterating over around 36 items).
>
> idea: This patch aggregates the statistics by going through all the
> items of each cpu sequentially, which reduces cache misses.
>
> Docker creation got faster by more than 2x after the patch.
>
> Result:
>                        Before    After
> Docker creation time   6.836s    3.25s
> cache miss             2.7%      1.41%
>
> perf before:
> 50.73%  docker   [kernel.kallsyms]  [k] snmp_fold_field
>  9.07%  swapper  [kernel.kallsyms]  [k] snooze_loop
>  3.49%  docker   [kernel.kallsyms]  [k] veth_stats_one
>  2.85%  swapper  [kernel.kallsyms]  [k] _raw_spin_lock
>
> perf after:
> 10.57%  docker   docker             [.] scanblock
>  8.37%  swapper  [kernel.kallsyms]  [k] snooze_loop
>  6.91%  docker   [kernel.kallsyms]  [k] snmp_get_cpu_field
>  6.67%  docker   [kernel.kallsyms]  [k] veth_stats_one
>
> changes/ideas suggested:
> Use a buffer on the stack (Eric), use memset (David), use memcpy in
> place of unaligned_put (Joe).
>
> Signed-off-by: Raghavendra K T <raghavendra.kt@xxxxxxxxxxxxxxxxxx>
> ---
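
For readers without the diff at hand, the loop interchange described in the
changelog can be sketched in plain userspace C. This is only an illustration
under stated assumptions, not the patch itself: percpu_mib, NR_CPUS_DEMO,
NR_ITEMS_DEMO, fill_demo, fold_per_item and fold_per_cpu are hypothetical
names, and a 2-D array stands in for the kernel's percpu data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_DEMO  4   /* hypothetical CPU count for the sketch */
#define NR_ITEMS_DEMO 36  /* roughly the item count quoted in the changelog */

/* One row of counters per CPU; a stand-in for the kernel's percpu MIB. */
static uint64_t percpu_mib[NR_CPUS_DEMO][NR_ITEMS_DEMO];

/* Fill the table with distinct values so both folds can be compared. */
static void fill_demo(void)
{
	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		for (int item = 0; item < NR_ITEMS_DEMO; item++)
			percpu_mib[cpu][item] = (uint64_t)cpu * 100 + item;
}

/* Old pattern: for each item, visit every CPU's data in turn. */
static void fold_per_item(uint64_t *out)
{
	assert(out != NULL);
	for (int item = 0; item < NR_ITEMS_DEMO; item++) {
		uint64_t sum = 0;

		for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
			sum += percpu_mib[cpu][item];
		out[item] = sum;
	}
}

/*
 * New pattern: for each CPU, walk all of its items sequentially and
 * accumulate into a stack buffer, then copy the result out once.
 */
static void fold_per_cpu(uint64_t *out)
{
	uint64_t buff[NR_ITEMS_DEMO];

	assert(out != NULL);
	memset(buff, 0, sizeof(buff));
	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		for (int item = 0; item < NR_ITEMS_DEMO; item++)
			buff[item] += percpu_mib[cpu][item];
	memcpy(out, buff, sizeof(buff));
}
```

Both functions produce the same totals; the second one touches each CPU's
block of counters once instead of once per item, which is where the
reduction in cache misses would be expected to come from.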

Acked-by: Eric Dumazet <edumazet@xxxxxxxxxx>

Thanks.
