Re: [PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

From: Oza Oza
Date: Tue May 16 2017 - 01:24:21 EST


On Sat, May 6, 2017 at 11:00 AM, Oza Oza <oza.oza@xxxxxxxxxxxx> wrote:
> On Fri, May 5, 2017 at 8:55 PM, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>> On 04/05/17 19:41, Oza Oza wrote:
>> [...]
>>>>> 5) leaves scope of adding PCI flag handling for inbound memory
>>>>> by the new function.
>>>>
>>>> Which flags would ever actually matter? DMA windows aren't going to be
>>>> to config or I/O space, so the memory type can be assumed, and the
>>>> 32/64-bit distinction is irrelevant as it's not a relocatable BAR;
>>>> DMA-able system memory isn't going to be read-sensitive, so the
>>>> prefetchable flag shouldn't matter; and not being a BAR none of the
>>>> others would be relevant either.
>>>>
>>>
>>> Thanks Robin; for your reply and attention:
>>>
>>> agree with you, at present it would not matter,
>>> but it does not mean that we do not scope it to make it matter in future.
>>>
>>> now where it could matter:
>>> there is Relaxed Ordering for inbound memory for PCI.
>>> According to standard, Relaxed Ordering (RO) bit can be set only for
>>> Memory requests and completions (if present in the original request).
>>> Also, according to transaction ordering rules, I/O and configuration
>>> requests can still be re-ordered ahead of each other.
>>> and we would like to make use of it.
>>> for e.g. lets say we mark memory as Relaxed Ordered with flag.
>>> the special about this memory is incoming PCI transactions can be
>>> reordered and rest memory has to be strongly ordered.
>>
>> Please look at "PCI Bus Binding to: IEEE Std 1275-1994 Standard for Boot
>> (Initialization Configuration) Firmware" (as referenced in DTSpec) and
>> explain how PCIe Relaxed Order has anything to do with the DT binding.
>>
>>> how it our SOC would make use of this is out of scope for the
>>> discussion at this point of time, but I am just bringing in the
>>> idea/point how flags could be useful
>>> for inbound memory, since we would not like throw-away flags completely.
>>
>> The premise for implementing a PCI-specific parser is that you claim we
>> need to do something with the phys.hi cell of a DT PCI address, rather
>> than just taking the numerical part out of the phys.mid and phys.lo
>> cells. Please make that argument in reference to the flags which that
>> upper cell actually encodes, not unrelated things.
>>
>
> I think, the whole discussion around inbound flags is not what I
> intended to bring.
> this patch does nothing about inbound flag and never intends to solve
> anything regarding inbound flags.
> infact I would like to remove point 5 form the commit message. which
> should keep it out of discussion completely.
>
> let met tell what this patch is trying to address/solve 2 BUGs
> 1) fix wrong size return from of_dma_configure for PCI masters. (which
> is right now BUG)
>
> 2) handles multiple dma-ranges cleanly
>
> 3) It takes care of dma-ranges being optional.
>
> 4) following is the comment on function of_dma_get_range (which is also a BUG)
> "It returns -ENODEV if "dma-ranges" property was not found
> * for this device in DT."
> which I think is wrong for PCI device, because if dma-ranges are
> absent then instead of returning -ENODEV,
> it should return 0 with largest possible host memory.
>
> it solves all the above 4 problems.
>
>> [...]
>>>>> +int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
>>>>> +{
>>>>> + struct device_node *node = of_node_get(np);
>>>>> + int rlen;
>>>>> + int ret = 0;
>>>>> + const int na = 3, ns = 2;
>>>>> + struct resource *res;
>>>>> + struct of_pci_range_parser parser;
>>>>> + struct of_pci_range range;
>>>>> +
>>>>> + if (!node)
>>>>> + return -EINVAL;
>>>>> +
>>>>> + parser.node = node;
>>>>> + parser.pna = of_n_addr_cells(node);
>>>>> + parser.np = parser.pna + na + ns;
>>>>> +
>>>>> + parser.range = of_get_property(node, "dma-ranges", &rlen);
>>>>> +
>>>>> + if (!parser.range) {
>>>>> + pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
>>>>> + np->full_name);
>>>>> + ret = -EINVAL;
>>>>> + goto out;
>>>>> + }
>>>>> +
>>>>> + parser.end = parser.range + rlen / sizeof(__be32);
>>>>> +
>>>>> + for_each_of_pci_range(&parser, &range) {
>>>>
>>>> This is plain wrong - of_pci_range_parser_one() will translate upwards
>>>> through parent "ranges" properties, which is completely backwards for
>>>> DMA addresses.
>>>>
>>>> Robin.
>>>>
>>>
>>> No it does not, this patch is thoroughly tested on our SOC and it works.
>>> of_pci_range_parser_one does not translate upwards through parent. it
>>> just sticks to given PCI parser.
>>
>> Frankly, I'm losing patience with this attitude. Please look at the code
>> you call:
>>
>> #define for_each_of_pci_range(parser, range) \
>> for (; of_pci_range_parser_one(parser, range);)
>>
>>
>> struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser
>> *parser,
>> struct of_pci_range *range)
>> {
>> const int na = 3, ns = 2;
>>
>> if (!range)
>> return NULL;
>>
>> if (!parser->range || parser->range + parser->np > parser->end)
>> return NULL;
>>
>> range->pci_space = parser->range[0];
>> range->flags = of_bus_pci_get_flags(parser->range);
>> range->pci_addr = of_read_number(parser->range + 1, ns);
>> range->cpu_addr = of_translate_address(parser->node,
>> parser->range + na);
>> ...
>>
>>
>> u64 of_translate_address(struct device_node *dev, const __be32 *in_addr)
>> {
>> return __of_translate_address(dev, in_addr, "ranges");
>> }
>>
>>
>> I don't doubt that you still manage to get the right result on *your*
>> SoC, because you probably have neither further "ranges" nor "dma-ranges"
>> translations above your host controller node anyway. That does not
>> change the fact that the proposed code is still obviously wrong for more
>> complex DT topologies that do.
>
> sorry but I am confused, and sorry again for not getting through on
> what you said.
>
> this patch assumes that the root bus would have dma-ranges.
> are you saying this code doesn't iterate through way up till it finds
> valid dma-ranges ?
>
> or
>
> you are saying
> of_pci_range_parser_one() will translate upwards
> through parent "ranges" properties
>
>>
>> We're doing upstream work in core code here: I don't particularly care
>> about making your SoC work; I don't really care about making Juno work
>> properly either; what I do care about is that code to parse dma-ranges
>> actually parses dma-ranges *correctly* for all possible valid uses of
>> dma-ranges, which means fixing the existing bugs and not introducing
>> more. The principal side-effect of that is that *all* systems with valid
>> DTs will then work correctly.
>>
>
> I do see your point now.....and my apologies for not getting it right
> at the first time.
>
> but I would not know all the nitty-gritty of all the code of framework and
> every complex DT topology of other SOCs.
> that is the reason we seek for comments from experts like you, to make
> the patch better.
>
> when I say it works on our SOC, I only meant that this patch is
> tested. so again apologies there.
>
> there is one obvious problem is
> of_translate_dma_address should get called instead of
> of_translate_address (so no "ranges") instead ("dma-ranges")
>
> but with that also as you said, it will traverse all the way up to the
> DT hierarchy.
>
> I think there are 2 problems with this patch.
>
> 1) this patch should try to iterate through all the way up to find
> first dma-ranges and should stop when it finds it.
> it should not assume that dma-ranges will always be found at the
> current node.
>
> 2) of_translate_dma_address is broken, because if point 1 is achieved,
> then no need to traverse anymore.
>
> but before that, again seeking your opinion, whether we want to go
> down this path.
> registering bus specific get_ranges as in PATCH v5 ? in my opinion it
> is better way of handling it.
>
> the original patch which you had in mind, you will have to club both
> PCI and memory -mapped implementation together.
> even if dma-ranges (ignoring flags) remain the same in nature, still
> you have to parse it differently. because address-cells are different.
> and you will have to handle multiple ranges.
>
> I just tried to bring it out to different path with registering bus
> specific callbacks.
>
>> Robin.

Hi Robin,

I have addressed your comments to the best of my understanding.
please have a look at PATCH v6

Regards,
Oza.