Re: [PATCH] x86/events/amd/iommu: Fix invalid Perf result due to IOMMU PMC power-gating

From: David Coe
Date: Tue May 04 2021 - 13:04:38 EST


Hi again!

On 04/05/2021 07:52, Suravee Suthikulpanit wrote:
On certain AMD platforms, when the IOMMU performance counter source
(csource) field is zero, power-gating for the counter is enabled, which
prevents write access and returns zero for read access.

This can cause invalid perf results, especially when event multiplexing
is needed (i.e. more events than available counters), since the current
logic keeps track of the previously read counter value and subsequently
re-programs the counter to continue counting the event. With power-gating
enabled, we cannot guarantee successful re-programming of the counter.

Work around this issue by:

1. Modifying the ordering of setting/reading counters and enabling/
disabling csources to only access the counter when the csource
is set to non-zero.

2. Since AMD IOMMU PMU does not support interrupt mode, the logic
can be simplified to always start counting with value zero,
and accumulate the counter value when stopping, without the need
to keep track of the previously read counter value and reprogram
the counter with it.
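
The two points above can be illustrated with a small user-space model. This is only a sketch, not the patch itself: the `sim_*` names and both structs are hypothetical, and the model assumes nothing beyond what the description states (a counter whose csource is zero ignores writes and reads back zero).

```c
#include <stdint.h>

/* Hypothetical model of one IOMMU counter; not the real register layout. */
struct sim_counter {
    uint64_t hw_value;  /* value held by the (simulated) hardware counter */
    uint8_t  csource;   /* 0 means the counter is power-gated */
};

/* Hypothetical per-event software state. */
struct sim_event {
    uint64_t total;     /* count accumulated across start/stop cycles */
};

/* A power-gated counter silently drops writes. */
static void sim_write(struct sim_counter *c, uint64_t v)
{
    if (c->csource != 0)
        c->hw_value = v;
}

/* A power-gated counter reads back as zero. */
static uint64_t sim_read(const struct sim_counter *c)
{
    return c->csource != 0 ? c->hw_value : 0;
}

/* Simulate n hardware events arriving; a gated counter does not count. */
static void sim_tick(struct sim_counter *c, uint64_t n)
{
    if (c->csource != 0)
        c->hw_value += n;
}

/* Start: set a non-zero csource first, then always begin from zero. */
static void sim_start(struct sim_counter *c, uint8_t csource)
{
    c->csource = csource;
    sim_write(c, 0);
}

/* Stop: read while csource is still non-zero, accumulate, then gate. */
static void sim_stop(struct sim_counter *c, struct sim_event *ev)
{
    ev->total += sim_read(c);
    c->csource = 0;
}
```

In this model, two events multiplexed on one counter each keep a correct `total` because nothing is ever written to or read from the counter while its csource is zero, and no previous value has to be restored on start.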


Results for Ryzen 4700U running Ubuntu 21.04 kernel 5.11.0-16 patched as above.

All amd_iommu events:

Performance counter stats for 'system wide':

         18  amd_iommu_0/cmd_processed/          (33.29%)
          9  amd_iommu_0/cmd_processed_inv/      (33.33%)
          0  amd_iommu_0/ign_rd_wr_mmio_1ff8h/   (33.36%)
        308  amd_iommu_0/int_dte_hit/            (33.40%)
          5  amd_iommu_0/int_dte_mis/            (33.45%)
        346  amd_iommu_0/mem_dte_hit/            (33.46%)
      8,954  amd_iommu_0/mem_dte_mis/            (33.48%)
          0  amd_iommu_0/mem_iommu_tlb_pde_hit/  (33.46%)
        771  amd_iommu_0/mem_iommu_tlb_pde_mis/  (33.44%)
         14  amd_iommu_0/mem_iommu_tlb_pte_hit/  (33.40%)
        836  amd_iommu_0/mem_iommu_tlb_pte_mis/  (33.36%)
          0  amd_iommu_0/mem_pass_excl/          (33.32%)
          0  amd_iommu_0/mem_pass_pretrans/      (33.28%)
      1,601  amd_iommu_0/mem_pass_untrans/       (33.27%)
          0  amd_iommu_0/mem_target_abort/       (33.27%)
      1,130  amd_iommu_0/mem_trans_total/        (33.27%)
          0  amd_iommu_0/page_tbl_read_gst/      (33.27%)
        312  amd_iommu_0/page_tbl_read_nst/      (33.28%)
        279  amd_iommu_0/page_tbl_read_tot/      (33.27%)
          0  amd_iommu_0/smi_blk/                (33.29%)
          0  amd_iommu_0/smi_recv/               (33.27%)
          0  amd_iommu_0/tlb_inv/                (33.26%)
          0  amd_iommu_0/vapic_int_guest/        (33.25%)
        366  amd_iommu_0/vapic_int_non_guest/    (33.27%)

10.001941666 seconds time elapsed
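
For readers comparing this run with the grouped runs: the percentage in parentheses is the share of the measurement interval during which the event was actually scheduled on a counter, and perf extrapolates the displayed count from that. A minimal sketch of that extrapolation, assuming the usual enabled/running scaling applies to these events (`scale_count` is a hypothetical helper, not a perf function):

```c
#include <stdint.h>

/*
 * Hypothetical helper: extrapolate a raw count that was only collected
 * for time_running out of time_enabled (same units for both).
 */
static uint64_t scale_count(uint64_t raw, uint64_t time_enabled,
                            uint64_t time_running)
{
    if (time_running == 0)
        return 0;  /* event never got a counter; nothing to scale */

    /* Round to nearest; the exact rounding used by perf is an assumption. */
    return (uint64_t)((double)raw * (double)time_enabled /
                      (double)time_running + 0.5);
}
```

An event that ran for about a third of the interval therefore has its raw count roughly tripled, which is why multiplexed and non-multiplexed runs are only comparable in order of magnitude.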


Groups of 8 amd_iommu events:

Performance counter stats for 'system wide':

         14  amd_iommu_0/cmd_processed/
          7  amd_iommu_0/cmd_processed_inv/
          0  amd_iommu_0/ign_rd_wr_mmio_1ff8h/
        502  amd_iommu_0/int_dte_hit/
          6  amd_iommu_0/int_dte_mis/
        532  amd_iommu_0/mem_dte_hit/
     13,622  amd_iommu_0/mem_dte_mis/
        159  amd_iommu_0/mem_iommu_tlb_pde_hit/

10.002170562 seconds time elapsed


Performance counter stats for 'system wide':

        762  amd_iommu_0/mem_iommu_tlb_pde_mis/
         20  amd_iommu_0/mem_iommu_tlb_pte_hit/
        698  amd_iommu_0/mem_iommu_tlb_pte_mis/
          0  amd_iommu_0/mem_pass_excl/
          0  amd_iommu_0/mem_pass_pretrans/
         15  amd_iommu_0/mem_pass_untrans/
          0  amd_iommu_0/mem_target_abort/
        718  amd_iommu_0/mem_trans_total/

10.001683428 seconds time elapsed


Performance counter stats for 'system wide':

          0  amd_iommu_0/page_tbl_read_gst/
         33  amd_iommu_0/page_tbl_read_nst/
         33  amd_iommu_0/page_tbl_read_tot/
          0  amd_iommu_0/smi_blk/
          0  amd_iommu_0/smi_recv/
          0  amd_iommu_0/tlb_inv/
          0  amd_iommu_0/vapic_int_guest/
     11,638  amd_iommu_0/vapic_int_non_guest/

10.002205748 seconds time elapsed