perf numbers (was: Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue)

From: Borislav Petkov
Date: Sun Apr 26 2015 - 07:22:27 EST


On Sat, Apr 25, 2015 at 11:12:06PM +0200, Borislav Petkov wrote:
> I've prepended the perf stat output with markers A:, B: or C: for easier
> comparing. The markers mean:
>
> A: Linus' master from a couple of days ago + tip/master + tip/x86/asm
> B: With Andy's SYSRET patch ontop
> C: Without RCX canonicalness check (see patch at the end).

What was missing is D = B+C results, here they are:

A: 2835570.145246 cpu-clock (msec) ( +- 0.02% ) [100.00%]
B: 2833364.074970 cpu-clock (msec) ( +- 0.04% ) [100.00%]
C: 2834708.335431 cpu-clock (msec) ( +- 0.02% ) [100.00%]
D: 2835055.118431 cpu-clock (msec) ( +- 0.01% ) [100.00%]

A: 2835570.099981 task-clock (msec) # 3.996 CPUs utilized ( +- 0.02% ) [100.00%]
B: 2833364.073633 task-clock (msec) # 3.996 CPUs utilized ( +- 0.04% ) [100.00%]
C: 2834708.350387 task-clock (msec) # 3.996 CPUs utilized ( +- 0.02% ) [100.00%]
D: 2835055.094383 task-clock (msec) # 3.996 CPUs utilized ( +- 0.01% ) [100.00%]

A: 5,591,213,166,613 cycles # 1.972 GHz ( +- 0.03% ) [75.00%]
B: 5,585,023,802,888 cycles # 1.971 GHz ( +- 0.03% ) [75.00%]
C: 5,587,983,212,758 cycles # 1.971 GHz ( +- 0.02% ) [75.00%]
D: 5,584,838,532,936 cycles # 1.970 GHz ( +- 0.03% ) [75.00%]

A: 3,106,707,101,530 instructions # 0.56 insns per cycle ( +- 0.01% ) [75.00%]
B: 3,106,632,251,528 instructions # 0.56 insns per cycle ( +- 0.00% ) [75.00%]
C: 3,106,265,958,142 instructions # 0.56 insns per cycle ( +- 0.00% ) [75.00%]
D: 3,106,294,801,185 instructions # 0.56 insns per cycle ( +- 0.00% ) [75.00%]

A: 683,676,044,429 branches # 241.107 M/sec ( +- 0.01% ) [75.00%]
B: 683,670,899,595 branches # 241.293 M/sec ( +- 0.01% ) [75.00%]
C: 683,675,772,858 branches # 241.180 M/sec ( +- 0.01% ) [75.00%]
D: 683,683,533,664 branches # 241.154 M/sec ( +- 0.00% ) [75.00%]

A: 43,829,535,008 branch-misses # 6.41% of all branches ( +- 0.02% ) [75.00%]
B: 43,844,118,416 branch-misses # 6.41% of all branches ( +- 0.03% ) [75.00%]
C: 43,819,871,086 branch-misses # 6.41% of all branches ( +- 0.02% ) [75.00%]
D: 43,795,107,998 branch-misses # 6.41% of all branches ( +- 0.02% ) [75.00%]

A: 2,030,357 context-switches # 0.716 K/sec ( +- 0.06% ) [100.00%]
B: 2,029,313 context-switches # 0.716 K/sec ( +- 0.05% ) [100.00%]
C: 2,028,566 context-switches # 0.716 K/sec ( +- 0.06% ) [100.00%]
D: 2,028,895 context-switches # 0.716 K/sec ( +- 0.06% ) [100.00%]

A: 52,421 migrations # 0.018 K/sec ( +- 1.13% )
B: 52,049 migrations # 0.018 K/sec ( +- 1.02% )
C: 51,365 migrations # 0.018 K/sec ( +- 0.92% )
D: 51,766 migrations # 0.018 K/sec ( +- 1.11% )

A: 709.528485252 seconds time elapsed ( +- 0.02% )
B: 708.976557288 seconds time elapsed ( +- 0.04% )
C: 709.312844791 seconds time elapsed ( +- 0.02% )
D: 709.400050112 seconds time elapsed ( +- 0.01% )

So in all events except "branches" - which is comprehensible, btw -
we have a minimal net win when looking at how the numbers in A have
improved in D *with* *both* patches applied.

And those numbers are pretty nice IMO - even if the net win is
measurement artefact and not really a win, we certainly don't have a net
loss so unless anyone objects, I'm going to apply both patches but wait
for Andy's v2 with better comments and changed ss_sel test as per Denys'
suggestion.

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/