Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

From: Eric Dumazet
Date: Fri Aug 28 2015 - 20:06:37 EST


On Fri, 2015-08-28 at 16:12 -0700, Joe Perches wrote:

> Generally true. It's always difficult to know how much
> stack has been consumed though and smaller stack frames
> are generally better.

Calling kmalloc(288, GFP_KERNEL) might use way more than 288 bytes of
kernel stack on a 64-bit arch.

__slab_alloc() itself, for example, uses 208 bytes of stack, so add the
frames of all the other functions in the allocation path, and you might
go above 500 bytes.

So for a _leaf_ function, it is better to declare an automatic variable,
as you in fact reduce max stack depth.

Not only does it use less kernel stack, it is also way faster, as you
avoid kmalloc()/kfree() overhead and probably reuse already hot cache
lines in the kernel stack.
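
To make the comparison concrete, here is a minimal user-space sketch of
the two options (not the actual patch). All names in it
(fold_stats_stack(), fold_stats_heap(), percpu_stats, SKETCH_NR_ITEMS,
SKETCH_NR_CPUS) are hypothetical, and calloc()/free() merely stand in
for kmalloc()/kfree(). The item count of 36 is an assumption picked
only because 36 * 8 = 288 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes: 36 u64 counters give a 288-byte scratch buffer. */
#define SKETCH_NR_ITEMS 36
#define SKETCH_NR_CPUS  4

static_assert(sizeof(uint64_t[SKETCH_NR_ITEMS]) == 288,
	      "scratch buffer is expected to be 288 bytes");

/* Stand-in for per-cpu counters (real kernel code would use percpu data). */
uint64_t percpu_stats[SKETCH_NR_CPUS][SKETCH_NR_ITEMS];

/*
 * Variant A: automatic variable. The 288 bytes live in this function's
 * own frame and the function makes no explicit calls, so nothing is
 * stacked on top of it.
 */
void fold_stats_stack(uint64_t *out)
{
	uint64_t buff[SKETCH_NR_ITEMS];
	int cpu, i;

	for (i = 0; i < SKETCH_NR_ITEMS; i++)
		buff[i] = 0;

	/* Walk every cpu once, accumulating all items into the scratch buffer. */
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		for (i = 0; i < SKETCH_NR_ITEMS; i++)
			buff[i] += percpu_stats[cpu][i];

	for (i = 0; i < SKETCH_NR_ITEMS; i++)
		out[i] = buff[i];
}

/*
 * Variant B: heap scratch buffer. This frame is smaller, but the
 * allocator's own call chain runs on the same stack, there is an
 * allocate/free pair to pay for, and a failure path to handle.
 */
int fold_stats_heap(uint64_t *out)
{
	uint64_t *buff = calloc(SKETCH_NR_ITEMS, sizeof(*buff));
	int cpu, i;

	if (!buff)
		return -1;	/* the stack variant has no such failure mode */

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		for (i = 0; i < SKETCH_NR_ITEMS; i++)
			buff[i] += percpu_stats[cpu][i];

	for (i = 0; i < SKETCH_NR_ITEMS; i++)
		out[i] = buff[i];

	free(buff);
	return 0;
}
```

Both variants produce the same totals. The difference is only where
the scratch space comes from: variant A costs a fixed 288 bytes at the
bottom of the call chain, while variant B's peak stack use depends on
how deep the allocator goes.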


