Re: [RESEND PATCH 1/2] dt-bindings: dma: Add reg-names to nvidia,tegra210-adma
From: Krzysztof Kozlowski
Date: Fri May 31 2024 - 03:43:48 EST
On 30/05/2024 14:48, Thierry Reding wrote:
> On Tue May 28, 2024 at 8:48 AM CEST, Krzysztof Kozlowski wrote:
>> On 24/05/2024 09:36, Thierry Reding wrote:
>>> On Wed May 22, 2024 at 1:29 PM CEST, Krzysztof Kozlowski wrote:
>>>> On 22/05/2024 09:43, Sameer Pujar wrote:
>>>>>
>>>>>
>>>>> On 22-05-2024 12:17, Krzysztof Kozlowski wrote:
>>>>>> On 22/05/2024 07:35, Sameer Pujar wrote:
>>>>>>> On 21-05-2024 17:23, Krzysztof Kozlowski wrote:
>>>>>>>> On 21/05/2024 13:08, Sameer Pujar wrote:
>>>>>>>>> From: Mohan Kumar <mkumard@xxxxxxxxxx>
>>>>>>>>>
>>>>>>>>> For Non-Hypervisor mode, Tegra ADMA driver requires the register
>>>>>>>>> resource range to include both global and channel page in the reg
>>>>>>>>> entry. For Hypervisor more, Tegra ADMA driver requires only the
>>>>>>>>> channel page and global page range is not allowed for access.
>>>>>>>>>
>>>>>>>>> Add reg-names DT binding for Hypervisor mode to help driver to
>>>>>>>>> differentiate the config between Hypervisor and Non-Hypervisor
>>>>>>>>> mode of execution.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Mohan Kumar <mkumard@xxxxxxxxxx>
>>>>>>>>> Signed-off-by: Sameer Pujar <spujar@xxxxxxxxxx>
>>>>>>>>> ---
>>>>>>>>> .../devicetree/bindings/dma/nvidia,tegra210-adma.yaml | 10 ++++++++++
>>>>>>>>> 1 file changed, 10 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/Documentation/devicetree/bindings/dma/nvidia,tegra210-adma.yaml b/Documentation/devicetree/bindings/dma/nvidia,tegra210-adma.yaml
>>>>>>>>> index 877147e95ecc..ede47f4a3eec 100644
>>>>>>>>> --- a/Documentation/devicetree/bindings/dma/nvidia,tegra210-adma.yaml
>>>>>>>>> +++ b/Documentation/devicetree/bindings/dma/nvidia,tegra210-adma.yaml
>>>>>>>>> @@ -29,8 +29,18 @@ properties:
>>>>>>>>> - const: nvidia,tegra186-adma
>>>>>>>>>
>>>>>>>>> reg:
>>>>>>>>> + description: |
>>>>>>>>> + For hypervisor mode, the address range should include a
>>>>>>>>> + ADMA channel page address range, for non-hypervisor mode
>>>>>>>>> + it starts with ADMA base address covering Global and Channel
>>>>>>>>> + page address range.
>>>>>>>>> maxItems: 1
>>>>>>>>>
>>>>>>>>> + reg-names:
>>>>>>>>> + description: only required for Hypervisor mode.
>>>>>>>> This does not work like that. I provide vm entry for non-hypervisor mode
>>>>>>>> and what? You claim it is virtualized?
>>>>>>>>
>>>>>>>> Drop property.
>>>>>>> With 'vm' entry added for hypervisor mode, the 'reg' address range needs
>>>>>>> to be updated to use channel specific region only. This is used to
>>>>>>> inform driver to skip global regions which is taken care by hypervisor.
>>>>>>> This is expected to be used in the scenario where Linux acts as a
>>>>>>> virtual machine (VM). May be the hypervisor mode gives a different
>>>>>>> impression here? Sorry, I did not understand what dropping the property
>>>>>>> exactly means here.
>>>>>> It was imperative. Drop it. Remove it. I provided explanation why.
>>>>>
>>>>> The driver doesn't know if it is operated in a native config or in the
>>>>> hypervisor config based on the 'reg' address range alone. So 'vm' entry
>>>>> with restricted 'reg' range is used to differentiate here for the
>>>>> hypervisor config. Just adding 'vm' entry won't be enough, the 'reg'
>>>>> region must be updated as well to have expected behavior. Not sure how
>>>>> this dependency can be enforced in the schema.
>>>>
>>>> That's not a unusual problem, so please come with a solution for your
>>>> entire subarch. We've been discussing similar topic in terms of SCMI
>>>> controlled resources (see talk on Linaro Connect a week ago:
>>>> https://www.kitefor.events/events/linaro-connect-24/submissions/161 I
>>>> don't know where is recording or slides, see also discussions on mailing
>>>> lists about it), which is not that far away from the problem here. Other
>>>> platforms and maybe nvidia had as well changes in IO space for
>>>> virtualized configuration.
>>>>
>>>> Come with unified approach FOR ALL your devices, not only this one
>>>> (that's kind of basic thing we keep repeating... don't solve only one
>>>> your problem), do not abuse the regular property, because as I said:
>>>> reg-names will be provided as well in non-vm case and then your entire
>>>> logic is wrong. The purpose of reg-names is not to tell whether you have
>>>> or have not virtualized environment.
>>>
>>> This isn't strictly about telling whether this is a virtualized
>>> environment or not. Unfortunately the bindings don't make that very
>>> clear, so let me try to give a bit more background.
>>>
>>> On Tegra devices the register regions associated with a device are
>>> usually split up into 64 KiB chunks.
>>
>> So describing it as one IO region was incorrect from the start and you
>> want to fix it by adding one more incorrect description: making first
>> item meaning two different things. Sorry, that's not a correct way to
>> fix things.
>
> Yes, describing this as one I/O region was incorrect, and in hindsight
> it should have been done differently.
>
> However, I don't think it's correct to describe this as adding one more
> incorrect description. Instead, what this does is add reg-names to
> provide additional context so that the operating system can make the
> necessary decisions as to what is allowed and what isn't.
>
> In the absence of a reg-names property the current definition of the DT
> bindings applies, so it means the region represents the entirety of the
> device's I/O register space. That's one particular use-case for this
> device.
>
> For additional use-cases we can then use reg-names to differentiate
> between what separate regions are and use them accordingly.
>
>> Items are defined, thus first item is always expected to be what the
>> binding already said. Adding reg-names changes nothing, because (as
>> repeated many times) xxx-names is just a helper. Items are already defined.
>
> I don't understand what you're trying to say here. I suppose adding
> reg-names alone indeed doesn't change anything. But the point is that
> once added we can now use these properties, at which point of course
> things change.
>
>>> One of these chunks, usually the first one, is a global region that
>>> contains registers that configure the device as a whole. This is usually
>>> privileged and accessible only to the hypervisor.
>>>
>>> Subsequent regions are meant to be assigned to individual VMs. Often the
>>> regions take the form of "channels", so they are instances of the same
>>> register block and control that separate slice of the hardware.
>>>
>>> What makes this a bit confusing is that for the sake of simplicity (and,
>>> I guess, lack of foresight) the original bindings were written in a way
>>> to encompass all registers without making that distinction. This worked
>>> fine because we've only ever run Linux as host OS where it has access to
>>> all those registers.
>>>
>>> However, when we move to virtualized environments that no longer works.
>>>
>>> Given the above, we can't read any registers in order to probe whether
>>> we run as a guest or not. Trying to access any of the global registers
>>> from a VM simply won't work and may crash the system. None of the
>>> "channel" registers contain information indicating host vs. guest
>>> either.
>>
>> I don't understand how it differs from what I said - you want to
>> indicate that you run in virtualized environment and not all resources
>> are accessible.
>>
>> The device still has the first (global) address, just it is not
>> available due to hypervisor.
>
> Yes, and that's a bad thing because there's no way for the device to
> know that it can't access the registers. So it will just assume that it
> can and try to access them, which would then result in a crash/error.
Different compatible could note that or the global address would be
removed from IO space, although then you need to rely on names and order
is not fixed. I think Rob already proposed different compatible.
This is also the way new Qcom platforms are going (older were using
properties).
However my earlier comment stays on: you will have for sure more cases
like this, so please think upfront and pick unified approach for all
future devices.
Best regards,
Krzysztof