Re: [GIT PULL] perf x86 updates for v3.20

From: Andy Lutomirski
Date: Mon Feb 16 2015 - 15:55:35 EST

On 02/15/2015 11:48 PM, Ingo Molnar wrote:

> Please pull the latest perf-core-for-linus git tree from:
>
> git:// perf-core-for-linus
>
> # HEAD: a66734297f78707ce39d756b656bfae861d53f62 perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks

> The extra CR4 manipulation adds ~ <50ns to the context
> switch cost between rdpmc-capable and rdpmc-non-capable

That's about the best I could benchmark, too -- if it were more than about 50ns, I'm pretty sure I would have seen a difference, but, as it stands, it seems to have been lost in the noise. Maybe I should find a better benchmark.

In any event, this series is probably a mixed bag performance-wise. In the best case, there's a small extra cost in context switches, and, when switching PCE, there's a CR4 write. On SVM guests, the CR4 write will suck.

To balance that out, I removed a CR4 read from VMX entry and from global TLB flushes. The former mostly fixes a performance regression from a security fix a few releases back, and I expect that the latter will more than offset the added context switch overhead (especially on SVM guests, where even CR4 reads exit AFAIK).
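
To make the trade-off concrete, here is a rough userspace model of the idea, assuming CR4 reads are avoided by keeping a cached copy of the register. It is only a sketch, not the kernel code: struct fake_cpu, the write counter and all of the helper names are hypothetical, and the "hardware" register is just a struct field. The two bit positions are the architectural ones (PGE is bit 7, PCE is bit 8).

```c
#include <assert.h>

#define X86_CR4_PGE (1UL << 7)	/* global pages enabled */
#define X86_CR4_PCE (1UL << 8)	/* RDPMC allowed from user mode */

/* Hypothetical stand-in for one CPU: the "real" CR4 plus a cached copy. */
struct fake_cpu {
	unsigned long hw_cr4;		/* pretend hardware register */
	unsigned long shadow_cr4;	/* cached copy, cheap to read */
	unsigned int hw_writes;		/* counts expensive register writes */
};

/* Every real write also refreshes the shadow so the two never diverge. */
static void write_cr4(struct fake_cpu *cpu, unsigned long val)
{
	cpu->hw_cr4 = val;
	cpu->shadow_cr4 = val;
	cpu->hw_writes++;
}

/* Set bits, touching the register only if something actually changes. */
static void cr4_set_bits(struct fake_cpu *cpu, unsigned long mask)
{
	unsigned long cr4 = cpu->shadow_cr4;

	if ((cr4 | mask) != cr4)
		write_cr4(cpu, cr4 | mask);
}

/* Clear bits, again skipping the write when nothing changes. */
static void cr4_clear_bits(struct fake_cpu *cpu, unsigned long mask)
{
	unsigned long cr4 = cpu->shadow_cr4;

	if ((cr4 & ~mask) != cr4)
		write_cr4(cpu, cr4 & ~mask);
}

/* Context switch: only a change in rdpmc capability costs a CR4 write. */
static void switch_pce(struct fake_cpu *cpu, int next_allows_rdpmc)
{
	if (next_allows_rdpmc)
		cr4_set_bits(cpu, X86_CR4_PCE);
	else
		cr4_clear_bits(cpu, X86_CR4_PCE);
}

/* Global TLB flush by toggling PGE: two writes, no register read. */
static void flush_tlb_global(struct fake_cpu *cpu)
{
	unsigned long cr4 = cpu->shadow_cr4;

	write_cr4(cpu, cr4 & ~X86_CR4_PGE);
	write_cr4(cpu, cr4);
	assert(cpu->hw_cr4 == cpu->shadow_cr4);
}
```

In this model, switching between two tasks with the same rdpmc setting costs no CR4 access at all, a switch that changes PCE costs one write, and a global flush costs two writes but never reads the register.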

Anyway, I tried and failed to detect any difference at all. Context switch timing was very noisy for me.

