sk->sk_prot->memory_allocated points to a global atomic variable:
atomic_long_t tcp_memory_allocated ____cacheline_aligned_in_smp;
If the per-cpu cache size is increased from 1MB to e.g. 16MB,
updates to sk->sk_prot->memory_allocated can be reduced further.
Performance may improve on systems with many cores.
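
To illustrate why a larger per-cpu cache means fewer writes to the shared
counter, here is a minimal single-threaded userspace sketch. It is not the
kernel code: NR_CPUS, mem_add() and count_global_updates() are hypothetical
names, and the per-cpu cache is modelled as a plain array.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 4   /* hypothetical CPU count for the simulation */

static atomic_long global_allocated;  /* stand-in for the shared counter */
static long local_delta[NR_CPUS];     /* stand-in for the per-cpu cache */
static long global_updates;           /* writes made to the shared counter */

/* Charge `pages` on `cpu`. The shared atomic is only touched once the
 * local delta reaches the reserve in either direction. */
static void mem_add(int cpu, long pages, long reserve_pages)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	local_delta[cpu] += pages;
	if (local_delta[cpu] >= reserve_pages ||
	    local_delta[cpu] <= -reserve_pages) {
		atomic_fetch_add(&global_allocated, local_delta[cpu]);
		local_delta[cpu] = 0;
		global_updates++;
	}
}

/* Reset all state, spread n_ops one-page charges round-robin over the
 * simulated CPUs, and return how often the shared counter was written. */
static long count_global_updates(long reserve_pages, long n_ops)
{
	atomic_store(&global_allocated, 0);
	global_updates = 0;
	for (int c = 0; c < NR_CPUS; c++)
		local_delta[c] = 0;
	for (long i = 0; i < n_ops; i++)
		mem_add((int)(i % NR_CPUS), 1, reserve_pages);
	return global_updates;
}
```

Assuming 4 KB pages, a 1MB reserve is 256 pages and a 16MB reserve is 4096
pages. For 1 << 20 one-page charges, count_global_updates(256, 1L << 20)
gives 4096 shared-counter writes, while count_global_updates(4096, 1L << 20)
gives 256, i.e. 16 times fewer.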
This looks good. Do you have any performance numbers to share?
On a host with 384 threads, 384 * 16 MB -> 6 GB of memory.
With this kind of use, we might need a shrinker...
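
The 6 GB figure can be checked with a small helper. This is only a sketch of
the arithmetic, assuming every CPU holds a full reserve at the same time;
max_cached_bytes() is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

/* Upper bound on memory held across all per-cpu caches, assuming each
 * of nr_cpus CPUs sits at its full reserve of reserve_bytes. */
static uint64_t max_cached_bytes(uint32_t nr_cpus, uint64_t reserve_bytes)
{
	return (uint64_t)nr_cpus * reserve_bytes;
}
```

With 384 threads, max_cached_bytes(384, 16ULL << 20) is 6ULL << 30 (6 GB),
compared with 384ULL << 20 (384 MB) for the current 1MB reserve.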