Re: [PATCH v1 01/10] powerpc/pseries/iommu: Replace hard-coded page shift
From: Alexey Kardashevskiy
Date: Thu Aug 27 2020 - 22:27:43 EST
On 28/08/2020 01:32, Leonardo Bras wrote:
> Hello Alexey, thank you for this feedback!
>
> On Sat, 2020-08-22 at 19:33 +1000, Alexey Kardashevskiy wrote:
>>> +#define TCE_RPN_BITS 52 /* Bits 0-51 represent RPN on TCE */
>>
>> Ditch this one and use MAX_PHYSMEM_BITS instead? I am pretty sure this
>> is the actual limit.
>
> I understand this MAX_PHYSMEM_BITS(51) comes from the maximum physical memory addressable in the machine. IIUC, it means we can access physical address up to (1ul << MAX_PHYSMEM_BITS).
>
> This 52 comes from PAPR "Table 9. TCE Definition" which defines bits
> 0-51 as the RPN. By looking at code, I understand that it means we may input any address < (1ul << 52) to TCE.
>
> In practice, MAX_PHYSMEM_BITS should be enough as of today, because I suppose we can't ever pass a physical page address over
> (1ul << 51), and TCE accepts up to (1ul << 52).
> But if we ever increase MAX_PHYSMEM_BITS, it doesn't necessarily means that TCE_RPN_BITS will also be increased, so I think they are independent values.
>
> Does it make sense? Please let me know if I am missing something.
The underlying hardware is PHB3/4 about which the IODA2 Version 2.4
6Apr2012.pdf spec says:
"The number of most significant RPN bits implemented in the TCE is
dependent on the max size of System Memory to be supported by the platform".
IODA3 is the same on this matter.
This is MAX_PHYSMEM_BITS and PHB itself does not have any other limits
on top of that. So the only real limit comes from MAX_PHYSMEM_BITS and
where TCE_RPN_BITS comes from exactly - I have no idea.
>
>>
>>
>>> +#define TCE_RPN_MASK(ps) ((1ul << (TCE_RPN_BITS - (ps))) - 1)
>>> #define TCE_VALID 0x800 /* TCE valid */
>>> #define TCE_ALLIO 0x400 /* TCE valid for all lpars */
>>> #define TCE_PCI_WRITE 0x2 /* write from PCI allowed */
>>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>>> index e4198700ed1a..8fe23b7dff3a 100644
>>> --- a/arch/powerpc/platforms/pseries/iommu.c
>>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>>> @@ -107,6 +107,9 @@ static int tce_build_pSeries(struct iommu_table *tbl, long index,
>>> u64 proto_tce;
>>> __be64 *tcep;
>>> u64 rpn;
>>> + const unsigned long tceshift = tbl->it_page_shift;
>>> + const unsigned long pagesize = IOMMU_PAGE_SIZE(tbl);
>>> + const u64 rpn_mask = TCE_RPN_MASK(tceshift);
>>
>> Using IOMMU_PAGE_SIZE macro for the page size and not using
>> IOMMU_PAGE_MASK for the mask - this incosistency makes my small brain
>> explode :) I understand the history but maaaaan... Oh well, ok.
>>
>
> Yeah, it feels kind of weird after two IOMMU related consts. :)
> But sure IOMMU_PAGE_MASK() would not be useful here :)
>
> And this kind of let me thinking:
>>> + rpn = __pa(uaddr) >> tceshift;
>>> + *tcep = cpu_to_be64(proto_tce | (rpn & rpn_mask) << tceshift);
> Why not:
> rpn_mask = TCE_RPN_MASK(tceshift) << tceshift;
A mask for a page number (but not the address!) hurts my brain, masks
are good against addresses but numbers should already have all bits
adjusted imho, may be it is just me :-/
>
> rpn = __pa(uaddr) & rpn_mask;
> *tcep = cpu_to_be64(proto_tce | rpn)
>
> I am usually afraid of changing stuff like this, but I think it's safe.
>
>> Good, otherwise. Thanks,
>
> Thank you for reviewing!
>
>
>
--
Alexey