x86 perf's dTLB-load-misses broken on IvyBridge?
From: Dave Hansen
Date: Tue Feb 18 2014 - 18:12:20 EST
I noticed that perf's dTLB-load-misses event isn't working on my
IvyBridge system:
> Performance counter stats for 'system wide':
>
> 0 dTLB-load-misses [100.00%]
> 48,570 dTLB-store-misses [100.00%]
> 202,573 iTLB-loads [100.00%]
> 271,546 iTLB-load-misses # 134.05% of all iTLB cache hits
But it works on a SandyBridge system that I have.
arch/x86/kernel/cpu/perf_event_intel.c seems to use the same tables for
SandyBridge and IvyBridge, so they both use the
'MEM_UOP_RETIRED.ALL_LOADS' event:
> [ C(DTLB) ] = {
> [ C(OP_READ) ] = {
> [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOP_RETIRED.ALL_LOADS */
> [ C(RESULT_MISS) ] = 0x0108, /* DTLB_LOAD_MISSES.CAUSES_A_WALK */
> },
But that event looks to be unsupported on this CPU:
> /ocperf.py stat -a -e mem_uops_retired.all_loads sleep 1
> perf stat -a -e cpu/event=0xd0,umask=0x81,name=mem_uops_retired_all_loads/ sleep 1
>
> Performance counter stats for 'system wide':
>
> <not supported> mem_uops_retired_all_loads
> 50,204,763 mem_uops_retired_all_loads_ps
But there's a "_ps" version, which uses PEBS, and that one does work?
> mem_uops_retired.all_loads [Load uops retired to architected path with filter on bits 0 and 1 applied. (Supports PEBS)]
> mem_uops_retired.all_loads_ps [Load uops retired to architected path with filter on bits 0 and 1 applied. (Uses PEBS) (Uses PEBS)]
Should we swap perf_event_intel.c over to use the PEBS version so that
it works everywhere?