Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v7

From: Eric Dumazet
Date: Wed Dec 07 2016 - 15:11:31 EST


On Wed, 2016-12-07 at 19:48 +0000, Mel Gorman wrote:
>
>
> Interesting because it didn't match what I previously measured but then
> again, when I established that netperf on localhost was slab intensive,
> it was also an older kernel. Can you tell me if SLAB or SLUB was enabled
> in your test kernel?
>
> Either that or the baseline I used has since been changed from what you
> are testing and we're not hitting the same paths.


lpaa6:~# uname -a
Linux lpaa6 4.9.0-smp-DEV #429 SMP @1481125332 x86_64 GNU/Linux

lpaa6:~# perf record -g ./netperf -t UDP_STREAM -l 3 -- -m 16384
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   16384   3.00       654644      0    28601.04
212992           3.00       654592           28598.77

[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.888 MB perf.data (~82481 samples) ]


perf report --stdio
...
     1.92%  netperf  [kernel.kallsyms]  [k] cache_alloc_refill
            |
            --- cache_alloc_refill
               |
               |--82.22%-- kmem_cache_alloc_node_trace
               |          __kmalloc_node_track_caller
               |          __alloc_skb
               |          alloc_skb_with_frags
               |          sock_alloc_send_pskb
               |          sock_alloc_send_skb
               |          __ip_append_data.isra.50
               |          ip_make_skb
               |          udp_sendmsg
               |          inet_sendmsg
               |          sock_sendmsg
               |          SYSC_sendto
               |          sys_sendto
               |          entry_SYSCALL_64_fastpath
               |          __sendto_nocancel
               |          |
               |           --100.00%-- 0x0
               |


Oh wait, sock_alloc_send_skb() requests all the bytes in skb->head:

struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
				    int noblock, int *errcode)
{
	return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
}


Maybe one day we will avoid doing order-4 (or even order-5 in extreme
cases!) allocations for loopback as we did for af_unix :P
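
To illustrate what avoiding the big linear allocation could look like: sock_alloc_send_pskb() takes a linear length and a paged length, and sock_alloc_send_skb() passes the whole size as linear and 0 as paged. Below is a user-space sketch (not kernel code) of splitting a send length so that only a bounded part stays in skb->head and the rest is rounded up to whole pages. The names split_send_len, struct send_split, LINEAR_MAX_ASSUMED and PAGE_SIZE_ASSUMED are all hypothetical, and the 16 KB cap and 4 KiB page size are assumptions chosen for illustration only.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED  4096UL   /* assumption: 4 KiB pages */
#define LINEAR_MAX_ASSUMED 16384UL  /* hypothetical cap on the linear head */

struct send_split {
	size_t header_len;  /* bytes kept in the linear area (skb->head) */
	size_t data_len;    /* bytes that would go into page fragments */
};

/* Split len into a bounded linear part and a page-aligned paged part. */
static struct send_split split_send_len(size_t len)
{
	struct send_split s = { len, 0 };

	if (len > LINEAR_MAX_ASSUMED) {
		size_t paged = len - LINEAR_MAX_ASSUMED;

		/* Round the paged part up to a whole number of pages. */
		paged = (paged + PAGE_SIZE_ASSUMED - 1) &
			~(PAGE_SIZE_ASSUMED - 1);
		/* Never hand more to the paged part than the message holds. */
		if (paged > len)
			paged = len;

		s.data_len = paged;
		s.header_len = len - paged;
	}
	return s;
}
```

With these assumed constants, a 16384-byte message stays entirely linear, while anything larger keeps at most 16384 bytes in the head and moves the remainder into order-0 pages.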

I mean, maybe some applications are sending 64KB UDP messages over
loopback right now...
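
For a rough feel of the orders involved, here is a small user-space sketch that computes the smallest page order able to hold a linear buffer of a given size. It assumes 4 KiB pages and a power-of-two allocation, and it adds a hypothetical 512 bytes of per-skb overhead (headers plus shared info); the real overhead depends on the kernel config, so treat OVERHEAD_ASSUMED and the helper names as illustrative only.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096UL  /* assumption: 4 KiB pages */
#define OVERHEAD_ASSUMED  512UL   /* hypothetical per-skb overhead in bytes */

/* Smallest order such that (PAGE_SIZE_ASSUMED << order) >= size. */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_ASSUMED;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Order needed when the whole payload lands in the linear area. */
static unsigned int skb_head_order(size_t payload)
{
	return order_for_size(payload + OVERHEAD_ASSUMED);
}
```

Under these assumptions skb_head_order(32768) evaluates to 4 and skb_head_order(65507), the largest UDP payload over IPv4, evaluates to 5, because the overhead pushes the request just past 64 KB.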