Hi Jin,
On 2/13/20 12:45 PM, Jin Yao wrote:
With this patch, for example,
 # perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread -- sleep 1
 Performance counter stats for 'system wide':
 CPU0 2,453,061 cpu/event=cpu-cycles,percore/
 CPU1 1,823,921 cpu/event=cpu-cycles,percore/
 CPU2 1,383,166 cpu/event=cpu-cycles,percore/
 CPU3 1,102,652 cpu/event=cpu-cycles,percore/
 CPU4 2,453,061 cpu/event=cpu-cycles,percore/
 CPU5 1,823,921 cpu/event=cpu-cycles,percore/
 CPU6 1,383,166 cpu/event=cpu-cycles,percore/
 CPU7 1,102,652 cpu/event=cpu-cycles,percore/
We can see counts are duplicated in CPU pairs
(CPU0/CPU4, CPU1/CPU5, CPU2/CPU6, CPU3/CPU7).
I tried this patch and I am getting somewhat weird results when any CPU
is offline. For example:
 $ lscpu | grep list
 On-line CPU(s) list: 0-4,6,7
 Off-line CPU(s) list: 5
 $ sudo ./perf stat -e cpu/event=cpu-cycles,percore/ -a -A --percore-show-thread -vv -- sleep 1
 ...
 cpu/event=cpu-cycles,percore/: 0: 23746491 1001189836 1001189836
 cpu/event=cpu-cycles,percore/: 1: 19802666 1001291299 1001291299
 cpu/event=cpu-cycles,percore/: 2: 24211983 1001394318 1001394318
 cpu/event=cpu-cycles,percore/: 3: 54051396 1001516816 1001516816
 cpu/event=cpu-cycles,percore/: 4: 6378825 1001064048 1001064048
 cpu/event=cpu-cycles,percore/: 5: 21299840 1001166297 1001166297
 cpu/event=cpu-cycles,percore/: 6: 13075410 1001274535 1001274535
 Performance counter stats for 'system wide':
 CPU0 30,125,316 cpu/event=cpu-cycles,percore/
 CPU1 19,802,666 cpu/event=cpu-cycles,percore/
 CPU2 45,511,823 cpu/event=cpu-cycles,percore/
 CPU3 67,126,806 cpu/event=cpu-cycles,percore/
 CPU4 30,125,316 cpu/event=cpu-cycles,percore/
 CPU7 67,126,806 cpu/event=cpu-cycles,percore/
 CPU0 30,125,316 cpu/event=cpu-cycles,percore/
 1.001918764 seconds time elapsed
I see the proper result without --percore-show-thread:
 $ sudo ./perf stat -e cpu/event=cpu-cycles,percore/ -a -A -vv -- sleep 1
 ...
 cpu/event=cpu-cycles,percore/: 0: 11676414 1001190709 1001190709
 cpu/event=cpu-cycles,percore/: 1: 39119617 1001291459 1001291459
 cpu/event=cpu-cycles,percore/: 2: 41821512 1001391158 1001391158
 cpu/event=cpu-cycles,percore/: 3: 46853730 1001492799 1001492799
 cpu/event=cpu-cycles,percore/: 4: 14448274 1001095948 1001095948
 cpu/event=cpu-cycles,percore/: 5: 42238217 1001191187 1001191187
 cpu/event=cpu-cycles,percore/: 6: 33129641 1001292072 1001292072
 Performance counter stats for 'system wide':
 S0-D0-C0 26,124,688 cpu/event=cpu-cycles,percore/
 S0-D0-C1 39,119,617 cpu/event=cpu-cycles,percore/
 S0-D0-C2 84,059,729 cpu/event=cpu-cycles,percore/
 S0-D0-C3 79,983,371 cpu/event=cpu-cycles,percore/
 1.001961563 seconds time elapsed
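For comparison, the per-thread output one would expect from the
--percore-show-thread run can be reconstructed from its raw -vv counts.
The following is only a minimal Python sketch, not perf code. It assumes
that index i in the -vv output is the i-th online CPU (0-4,6,7 per lscpu)
and that the SMT siblings are CPU n and CPU n+4, as in the CPU pairs quoted
earlier. The names core_of and percore_show_thread are made up for
illustration.

```python
# Raw per-index counts copied from the -vv output of the
# --percore-show-thread run.
raw_counts = [23746491, 19802666, 24211983, 54051396,
              6378825, 21299840, 13075410]

# Assumption: index i in the -vv output is the i-th online CPU.
online_cpus = [0, 1, 2, 3, 4, 6, 7]


def core_of(cpu):
    # Assumption: 4 cores, siblings are CPU n and CPU n+4.
    return cpu % 4


def percore_show_thread(counts, cpus):
    """Sum counts per core, then report that sum for every online thread."""
    core_sum = {}
    for cpu, count in zip(cpus, counts):
        core = core_of(cpu)
        core_sum[core] = core_sum.get(core, 0) + count
    return [(cpu, core_sum[core_of(cpu)]) for cpu in cpus]


expected = percore_show_thread(raw_counts, online_cpus)
for cpu, total in expected:
    print(f"CPU{cpu} {total:,} cpu/event=cpu-cycles,percore/")
```

Under these assumptions the first five lines agree with the output above
(CPU0 through CPU4), while the last two would be CPU6 45,511,823 and
CPU7 67,126,806 instead of CPU7 67,126,806 and a second CPU0 30,125,316.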
[...]
+--percore-show-thread::
+The event modifier "percore" has supported to sum up the event counts
+for all hardware threads in a core and show the counts per core.
+
+This option with event modifier "percore" enabled also sums up the event
+counts for all hardware threads in a core but show the sum counts per
+hardware thread. This is essentially a replacement for the any bit and
+convenient for posting process.
s/posting process/post processing/ ? :)
Ravi