I talked with Joe on my way out the door yesterday and he confirmed that just
removing -BN from our test shows a performance hit with your patch. With
the -BN option, there is no performance hit and we are perfectly fine with
your patch.
So, I guess I am confused about how -BN and your patch together could change
behaviour.
Just to re-iterate what we did, Joe kicked off a specJBB run and he did 20
captures, each consisting of two runs (one with the unpatched binary and one
with a patched binary):
for i in {1..20}
do
time perf.unpatched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
time perf.patched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
done
Then we repeat the above test but with -BN in both runs. We compare the
log sizes to make sure they are similar for the random snapshots and compare
the times. With the -BN option, the times are generally within +/- 0.5
seconds of each other. Without the -BN option, the patched perf binary is
generally 20-40 seconds slower.
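For reference, a minimal sketch of how the paired times could be summarized,
assuming the elapsed seconds from each iteration were saved one per line in
two files. The mean_delta helper and the file names are hypothetical, not
what Joe actually ran:

```shell
#!/bin/sh
# Hypothetical helper: summarize paired elapsed times (seconds, one per line).
# $1 = times from the unpatched binary, $2 = times from the patched binary.
# Line N of each file is assumed to come from the same loop iteration.
mean_delta() {
    paste "$1" "$2" | awk '
        { u += $1; p += $2; n++ }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            # delta is the mean of (patched - unpatched) across iterations
            printf "unpatched=%.2f patched=%.2f delta=%.2f\n", u/n, p/n, (p - u)/n
        }'
}
```

Running it once on the times taken with -BN and once on the times taken
without it, e.g. "mean_delta unpatched.times patched.times", would put the
two deltas side by side.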
However, based on your description above about what the -BN option does, I
am scratching my head about our results. Thoughts?