Re: [PATCH v2 2/4] PCI: cadence: Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property

From: Kishon Vijay Abraham I
Date: Fri May 08 2020 - 04:50:11 EST


Hi Rob,

On 5/8/2020 1:56 AM, Rob Herring wrote:
> On Wed, May 06, 2020 at 08:52:13AM +0530, Kishon Vijay Abraham I wrote:
>> Hi Robin,
>>
>> On 5/4/2020 6:23 PM, Kishon Vijay Abraham I wrote:
>>> Hi Robin,
>>>
>>> On 5/4/2020 4:24 PM, Robin Murphy wrote:
>>>> On 2020-05-04 9:44 am, Kishon Vijay Abraham I wrote:
>>>>> Hi Robin,
>>>>>
>>>>> On 5/1/2020 9:24 PM, Robin Murphy wrote:
>>>>>> On 2020-05-01 3:46 pm, Lorenzo Pieralisi wrote:
>>>>>>> [+Robin - to check on dma-ranges intepretation]
>>>>>>>
>>>>>>> I would need RobH and Robin to review this.
>>>>>>>
>>>>>>> Also, An ACK from Tom is required - for the whole series.
>>>>>>>
>>>>>>> On Fri, Apr 17, 2020 at 05:13:20PM +0530, Kishon Vijay Abraham I wrote:
>>>>>>>> Cadence PCIe core driver (host mode) uses "cdns,no-bar-match-nbits"
>>>>>>>> property to configure the number of bits passed through from PCIe
>>>>>>>> address to internal address in Inbound Address Translation register.
>>>>>>>>
>>>>>>>> However standard PCI dt-binding already defines "dma-ranges" to
>>>>>>>> describe the address range accessible by PCIe controller. Parse
>>>>>>>> "dma-ranges" property to configure the number of bits passed
>>>>>>>> through from PCIe address to internal address in Inbound Address
>>>>>>>> Translation register.
>>>>>>>>
>>>>>>>> Signed-off-by: Kishon Vijay Abraham I <kishon@xxxxxx>
>>>>>>>> ---
>>>>>>>> ÂÂ drivers/pci/controller/cadence/pcie-cadence-host.c | 13 +++++++++++--
>>>>>>>> ÂÂ 1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c
>>>>>>>> b/drivers/pci/controller/cadence/pcie-cadence-host.c
>>>>>>>> index 9b1c3966414b..60f912a657b9 100644
>>>>>>>> --- a/drivers/pci/controller/cadence/pcie-cadence-host.c
>>>>>>>> +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
>>>>>>>> @@ -206,8 +206,10 @@ int cdns_pcie_host_setup(struct cdns_pcie_rc *rc)
>>>>>>>> ÂÂÂÂÂÂ struct device *dev = rc->pcie.dev;
>>>>>>>> ÂÂÂÂÂÂ struct platform_device *pdev = to_platform_device(dev);
>>>>>>>> ÂÂÂÂÂÂ struct device_node *np = dev->of_node;
>>>>>>>> +ÂÂÂ struct of_pci_range_parser parser;
>>>>>>>> ÂÂÂÂÂÂ struct pci_host_bridge *bridge;
>>>>>>>> ÂÂÂÂÂÂ struct list_head resources;
>>>>>>>> +ÂÂÂ struct of_pci_range range;
>>>>>>>> ÂÂÂÂÂÂ struct cdns_pcie *pcie;
>>>>>>>> ÂÂÂÂÂÂ struct resource *res;
>>>>>>>> ÂÂÂÂÂÂ int ret;
>>>>>>>> @@ -222,8 +224,15 @@ int cdns_pcie_host_setup(struct cdns_pcie_rc *rc)
>>>>>>>> ÂÂÂÂÂÂ rc->max_regions = 32;
>>>>>>>> ÂÂÂÂÂÂ of_property_read_u32(np, "cdns,max-outbound-regions",
>>>>>>>> &rc->max_regions);
>>>>>>>> ÂÂ -ÂÂÂ rc->no_bar_nbits = 32;
>>>>>>>> -ÂÂÂ of_property_read_u32(np, "cdns,no-bar-match-nbits", &rc->no_bar_nbits);
>>>>>>>> +ÂÂÂ if (!of_pci_dma_range_parser_init(&parser, np))
>>>>>>>> +ÂÂÂÂÂÂÂ if (of_pci_range_parser_one(&parser, &range))
>>>>>>>> +ÂÂÂÂÂÂÂÂÂÂÂ rc->no_bar_nbits = ilog2(range.size);
>>>>>>
>>>>>> You probably want "range.pci_addr + range.size" here just in case the bottom of
>>>>>> the window is ever non-zero. Is there definitely only ever a single inbound
>>>>>> window to consider?
>>>>>
>>>>> Cadence IP has 3 inbound address translation registers, however we use only 1
>>>>> inbound address translation register to map the entire 32 bit or 64 bit address
>>>>> region.
>>>>
>>>> OK, if anything that further strengthens the argument for deprecating a single
>>>> "number of bits" property in favour of ranges that accurately describe the
>>>> window(s). However it also suggests that other users in future might have some
>>>> expectation that specifying "dma-ranges" with up to 3 entries should work to
>>>> allow a more restrictive inbound configuration. Thus it would be desirable to
>>>> make the code a little more robust here - even if we don't support multiple
>>>> windows straight off, it would still be better to implement it in a way that
>>>> can be cleanly extended later, and at least say something if more ranges are
>>>> specified rather than just silently ignoring them.
>>>
>>> I looked at this further in the Cadence user doc. The three inbound ATU entries
>>> are for BAR0, BAR1 in RC configuration space and the third one is for NO MATCH
>>> BAR when there is no matching found in RC BARs. Right now we always configure
>>> the NO MATCH BAR. Would it be possible describe at BAR granularity in dma-ranges?
>>
>> I was thinking if I could use something like
>> dma-ranges = <0x02000000 0x0 0x0 0x0 0x0 0x00000 0x0>, //For BAR0 IB mapping
>> <0x02000000 0x0 0x0 0x0 0x0 0x00000 0x0>, //For BAR1 IB mapping
>> <0x02000000 0x0 0x0 0x0 0x0 0x10000 0x0>; //NO MATCH BAR
>>
>> This way the driver can tell the 1st tuple is for BAR0, 2nd is for BAR1 and
>> last is for NO MATCH. In the above case both BAR0 and BAR1 is just empty and
>> doesn't have valid values as we use only the NO MATCH BAR.
>>
>> However I'm not able to use for_each_of_pci_range() in Cadence driver to get
>> the configuration for each BAR, since the for loop gets invoked only once since
>> of_pci_range_parser_one() merges contiguous addresses.
>
> NO_MATCH_BAR could just be the last entry no matter how many? Who cares
> if they get merged? Maybe each BAR has max size and dma-ranges could
> exceed that, but if so you have to handle that and split them again.

Each of RP_BAR0, RP_BAR1 and RP_NO_BAR has separate register to be configured.
If they get merged, we'll loose info on which of the registers to be
configured. Cadence IP specifies maximum size of BAR0 as 256GB, maximum size of
BAR1 as 2 GB. However when I specify dma-ranges like below and use
for_each_of_pci_range(&parser, &range), the first range itself is 258.

dma-ranges = <0x02000000 0x00 0x0 0x00 0x0 0x40 0x00000000>, /* BAR0 256 GB */
<0x02000000 0x40 0x0 0x40 0x0 0x00 0x80000000>; /* BAR1 2 GB */
>
>> Do you think I should extend the flags cell to differentiate between BAR0, BAR1
>> and NO MATCH BAR? Can you suggest any other alternatives?
>
> If you just have 1 region, then just 1 entry makes sense to me. Why
> can't you use BAR0 in that case?

Well, Cadence has specified a max size for each BAR. I think we could specify a
single region (48 bits in my case) in dma-ranges and let the driver decide how
to split it among BAR0, BAR1 and NO_MATCH_BAR?

Thanks
Kishon