[PATCH v4 0/2] bugfix and optimization about CMD_SYNC
From: Zhen Lei
Date: Sun Aug 19 2018 - 03:55:29 EST
1. Create a new function, arm_smmu_cmdq_build_sync_msi_cmd, which is only
used to build a CMD_SYNC in CS=SIG_IRQ mode.
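For illustration only, a dedicated builder for an MSI-based CMD_SYNC could
look roughly like the user-space sketch below. This is not the code from the
patch: the name sketch_build_sync_msi_cmd, its signature, and the SKETCH_*
constants are hypothetical, and the field positions (opcode in bits [7:0],
CS in [13:12], MSH in [23:22], MSIAttr in [27:24], MSIData in [63:32],
MSIAddress in bits [51:2] of the second dword) are my reading of the SMMUv3
command layout and should be checked against the specification.

```c
#include <stdint.h>

/* Assumed CMD_SYNC encoding; verify against the SMMUv3 specification. */
#define SKETCH_OP_CMD_SYNC      0x46ULL
#define SKETCH_CS_IRQ           1ULL    /* CS=SIG_IRQ: completion signalled by MSI write */
#define SKETCH_MSH_ISH          3ULL    /* inner shareable */
#define SKETCH_MSIATTR_OIWB     0xfULL  /* outer/inner write-back */

#define SKETCH_CS_SHIFT         12
#define SKETCH_MSH_SHIFT        22
#define SKETCH_MSIATTR_SHIFT    24
#define SKETCH_MSIDATA_SHIFT    32
#define SKETCH_MSIADDR_MASK     0x000ffffffffffffcULL   /* bits [51:2] */

/*
 * Hypothetical helper: fill a two-dword CMD_SYNC that makes the SMMU write
 * 'msidata' to 'msiaddr' on completion. Because it handles exactly one
 * command type, it needs no memset and no switch over opcodes, so it is
 * cheap enough to call while a queue lock is held.
 */
static void sketch_build_sync_msi_cmd(uint64_t cmd[2], uint32_t msidata,
                                      uint64_t msiaddr)
{
    cmd[0] = SKETCH_OP_CMD_SYNC |
             (SKETCH_CS_IRQ << SKETCH_CS_SHIFT) |
             (SKETCH_MSH_ISH << SKETCH_MSH_SHIFT) |
             (SKETCH_MSIATTR_OIWB << SKETCH_MSIATTR_SHIFT) |
             ((uint64_t)msidata << SKETCH_MSIDATA_SHIFT);
    cmd[1] = msiaddr & SKETCH_MSIADDR_MASK;
}
```

The point of the sketch is only that a single-purpose builder writes both
dwords directly, which keeps the work done under the lock small.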
2. To observe the effect of the optimization, I ran 5 tests for each case.
Although the results are volatile, we can still tell which case is better
and which is worse.
Test command: fio -numjobs=8 -rw=randread -runtime=30 ... -bs=4k
Test Result: IOPS
Case 1: (without these patches)
Case 2: (only apply the variant of patch 1 that moves arm_smmu_cmdq_build_cmd into the lock)
Case 3: (only apply patch 1)
Case 4: (apply both patch 1 and patch 2)
v2 -> v3:
Although I have no data showing how much performance is affected when
arm_smmu_cmdq_build_cmd is protected by the spinlock, it is clear that
performance is bound to drop: the function performs a memset and a
complicated switch..case, all of which would run with the lock held.
v1 -> v2:
1. Move the call to arm_smmu_cmdq_build_cmd into the critical section,
and leave the function itself unchanged.
2. Although patch 2 ensures that no two CMD_SYNCs are adjacent, patch 1
is still needed. See below:
cpu0 cpu1 cpu2
insert a TLBI command
smmu execute cmd1
smmu execute TLBI
smmu execute cmd0
poll timeout, because msidata=1 is overridden by
cmd0, that means VAL=0, sync_idx=1.
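The failure above can be replayed in a small user-space sketch. It assumes
the waiter uses a wrap-safe comparison of the form (int)(VAL - sync_idx) >= 0
and that the SMMU simply writes each CMD_SYNC's msidata to one shared word in
queue order. The helper names sync_complete and last_msi_value are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed wait condition: true once the shared word has reached sync_idx. */
static bool sync_complete(uint32_t val, uint32_t sync_idx)
{
    return (int32_t)(val - sync_idx) >= 0;
}

/*
 * Model the SMMU consuming CMD_SYNCs in queue order: each one overwrites
 * the shared MSI target word with its own msidata. Returns the value left
 * in the word after all n commands have executed.
 */
static uint32_t last_msi_value(const uint32_t *msidata, size_t n)
{
    uint32_t val = 0;
    size_t i;

    for (i = 0; i < n; i++)
        val = msidata[i];
    return val;
}
```

If cmd1 (msidata=1) is queued before cmd0 (msidata=0), last_msi_value
returns 0, and sync_complete(0, 1) is false, so a waiter on cpu1 that samples
the word after cmd0 has executed never sees its condition satisfied. When
msidata is allocated and the command is inserted in the same order, the word
ends at 1 and both waiters complete.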
Zhen Lei (2):
iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible
drivers/iommu/arm-smmu-v3.c | 44 ++++++++++++++++++++++++++++++++------------
1 file changed, 32 insertions(+), 12 deletions(-)