Re: [PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for AmpereOne

From: James Morse
Date: Wed Oct 02 2024 - 12:58:34 EST


Hello,

On 01/10/2024 21:47, Yang Shi wrote:
> On 10/1/24 12:46 PM, Jason Gunthorpe wrote:
>> On Tue, Oct 01, 2024 at 12:38:56PM -0700, Yang Shi wrote:
>>>
>>> On 10/1/24 12:18 PM, Jason Gunthorpe wrote:
>>>> On Tue, Oct 01, 2024 at 12:09:03PM -0700, Yang Shi wrote:
>>>>>> Also, there are other places doing "1 << smmu->sid_bits", e.g.
>>>>>> arm_smmu_init_strtab_linear().
>>>>> The disassembly shows it uses "sbfiz x21, x20, 6, 32" instead of lsl. 1UL
>>>>> should be used if we want to take extra caution and don't prefer rely on
>>>>> compiler.
>>>> Still, we should be fixing them all if sid_bits == 32, all those
>>>> shifts should be throwing a UBSAN error. It would be crazy to have a
>>> OK, will cover this is v2.
>> Maybe just make a little inline function to do this math and remove
>> the repated open coding? Then the types can be right, etc.
>
> Something like this?
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 01a2faee04bc..0f3aa7962117 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
>  {
>         u32 l1size;
>         struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +       unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
>         unsigned int last_sid_idx =
> -               arm_smmu_strtab_l1_idx((1UL << smmu->sid_bits) - 1);
> +               arm_smmu_strtab_l1_idx(max_sid - 1);
>
>         /* Calculate the L1 size, capped to the SIDSIZE. */
>         cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
> @@ -3657,8 +3658,9 @@ static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
>  {
>         u32 size;
>         struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
> +       unsigned int max_sid = arm_smmu_strtab_max_sid(smmu->sid_bits);
>
> -       size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
> +       size = max_sid * sizeof(struct arm_smmu_ste);
>         cfg->linear.table = dmam_alloc_coherent(smmu->dev, size,
> &cfg->linear.ste_dma,
>                                                 GFP_KERNEL);
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> index 1e9952ca989f..de9f14293485 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
> @@ -235,6 +235,11 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid)
>         return sid % STRTAB_NUM_L2_STES;
>  }
>
> +static inline unsigned int arm_smmu_strtab_max_sid(unsigned int sid_bits)
> +{
> +       return (1UL << sid_bits);
> +}
> +

On the same platform - this also fixes the issue:
Tested-by: James Morse <james.morse@xxxxxxx>


Thanks,

James