On Fri, 2008-02-15 at 07:05 +0100, Eric Dumazet wrote:
> Zhang, Yanmin wrote:
> > Comparing with kernel 2.6.24, tbench result has a regression with
> > 2.6.25-rc1:
> > 1) On 2 quad-core processor stoakley: 4%.
> > 2) On 4 quad-core processor tigerton: more than 30%.
> > Bisect located the patch below.
> >
> > b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b is first bad commit
> > commit b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b
> > Author: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> > Date:   Tue Nov 13 21:33:32 2007 -0800
> >
> >     [IPV6]: Move nfheader_len into rt6_info
> >
> >     The dst member nfheader_len is only used by IPv6. It's also currently
> >     creating a rather ugly alignment hole in struct dst. Therefore this
> >     patch moves it from there into struct rt6_info.
> >
> > As tbench uses IPv4, the patch's only real impact on IPv4 is that it
> > deletes nfheader_len from dst_entry, which may change cache line
> > alignment.
> >
> > To verify my finding, I added nfheader_len back to dst_entry in
> > 2.6.25-rc1 and reran tbench on the 2 machines. Performance was recovered
> > completely.
> >
> > I started cpu_number*2 tbench processes. On my 16-core tigerton:
> > #./tbench_srv &
> > #./tbench 32 127.0.0.1
> >
> > -yanmin
>
> Yup. struct dst is sensitive to alignment, especially for benchmarks.
>
> In the real world, we need to make sure that the next pointer starts at a
> cache line boundary (or a little bit after), so that RT cache lookups use
> one cache line per entry instead of two. This permits better behavior
> under DDoS attacks.
>
> (check commit 1e19e02ca0c5e33ea73a25127dbe6c3b8fcaac4b for reference)
>
> Are you using a 64 or a 32 bit kernel ?

A 64-bit kernel, on an x86-64 machine. On another 4-way Madison Itanium
machine, tbench has a similar regression.