Re: Performance/Memory usage patch

David S. Miller (davem@jenolan.rutgers.edu)
Thu, 10 Apr 1997 02:04:07 -0400


Date: Wed, 9 Apr 1997 10:29:05 +0100 (BST)
From: Mark Hemment <markhe@nextd.demon.co.uk>

The figures under "lmbench" show limited/no improvement, but
running other tests (such as a large number of kernel compiles) does
show a performance increase (particularly on a 486 box).

I believe networking throughput/latencies take a small performance
hit. This is probably due to different byte alignments of
dynamically allocated memory. I do not understand the networking
code - perhaps somebody could try SLABising it (making sure commonly
accessed members are on the same h/w cache line, and using
kmem_find_general_cachep()).

I think I know what is behind all this. I see in your patch that you
undo one of the arguments to the creation of the vm_area_struct SLAB
cache. I put that there _specifically_ because, if I did not, the
machine ran slow as balls. Investigation showed that I was _not_
getting 8-byte alignment for the structure, which is necessary to
prevent _all_ accesses to the loff_t member from invoking unaligned
access traps on the Sparc.
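
To make the alignment point concrete, here is a minimal sketch in plain
user-space C. The struct demo_area and both helpers are hypothetical
names made up for illustration; this is not the real vm_area_struct
layout, only a structure with a 64-bit member at a fixed offset. It
checks whether that member lands on an 8-byte boundary for a given
object base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure holding a 64-bit member.
 * This is NOT the real vm_area_struct layout. */
struct demo_area {
    uint32_t start;
    uint32_t end;
    int64_t  offset;    /* plays the role of the loff_t member */
};

/* Return 1 if addr is a multiple of align (align: power of two). */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Return 1 if the 64-bit member would sit on an 8-byte boundary
 * when an object of type struct demo_area is placed at base. */
static int offset_member_aligned(uintptr_t base)
{
    return is_aligned(base + offsetof(struct demo_area, offset), 8);
}
```

With an 8-byte aligned base such as 0x1000 the member is aligned; with
a base that is only 4-byte aligned, such as 0x1004, it is not, and on a
machine that traps on unaligned 64-bit loads every access to it would
take the slow path.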

You will see this as an issue on any modern architecture.

Kmalloc (sort of) guaranteed, for the most part, 8-byte alignment for
the blocks of memory it returned. At a minimum, SLAB and things built
upon it should do the same, for the very reason described above.
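
One common way to get such a minimum guarantee is to round every object
size up to the alignment before packing objects into a buffer. The
sketch below shows only that arithmetic; DEMO_MIN_ALIGN, align_up() and
object_offset() are hypothetical names, and this is not a description
of how the SLAB code in the patch actually lays out its objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimum alignment for this sketch. */
#define DEMO_MIN_ALIGN 8u

/* Round n up to the next multiple of align (align: power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Byte offset of the index-th object when objects of obj_size bytes
 * are packed back to back using the padded size as the stride. */
static size_t object_offset(size_t obj_size, size_t index)
{
    return index * align_up(obj_size, DEMO_MIN_ALIGN);
}
```

For a 20-byte object the stride becomes 24 bytes, so as long as the
buffer itself starts on an 8-byte boundary, every object in it does
too. The cost is the padding: 4 bytes per object in this case.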

I believe this can account for the networking slowdowns, and in
particular it may explain why the lmbench numbers were about the same
otherwise ;-)

If you dealt with this issue in your patch set (I've looked it over a
few times but have not played actively with it yet) I apologize.

---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s ////
ethernet. Beat that! ////
-----------------------------------------////__________ o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><