Re: [PATCHv6 02/16] drivers: thermal: introduce device tree parser

From: Eduardo Valentin
Date: Tue Sep 24 2013 - 11:51:51 EST


Zhang,

I appreciate your interest. I am replying a couple of your comments.
Please also check my answers to Mark's queries on other email thread.

On 24-09-2013 04:11, Hongbo Zhang wrote:
> On 09/23/2013 06:40 PM, Mark Rutland wrote:
>> Hi Eduardo,
>>
>> Apologies for having taken so long to get back you on this.
>>
>> I have several comments on the binding and the way it's parsed.
>>
>> On Wed, Sep 18, 2013 at 10:35:36PM +0100, Eduardo Valentin wrote:
>>> This patch introduces a device tree bindings for
>>> describing the hardware thermal behavior and limits.
>>> Also a parser to read and interpret the data and feed
>>> it in the thermal framework is presented.
>>>
>>> This patch introduces a thermal data parser for device
>>> tree. The parsed data is used to build thermal zones
>>> and thermal binding parameters. The output data
>>> can then be used to deploy thermal policies.
>>>
>>> This patch adds also documentation regarding this
>>> API and how to define tree nodes to use
>>> this infrastructure.
>>>
>>> Note that, in order to be able to have control
>>> on the sensor registration on the DT thermal zone,
>>> it was required to allow changing the thermal zone
>>> .get_temp callback. For this reason, this patch
>>> also removes the 'const' modifier from the .ops
>>> field of thermal zone devices.
>>>
>>> Cc: Zhang Rui <rui.zhang@xxxxxxxxx>
>>> Cc: linux-pm@xxxxxxxxxxxxxxx
>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>> Signed-off-by: Eduardo Valentin <eduardo.valentin@xxxxxx>
>>> ---
>>> .../devicetree/bindings/thermal/thermal.txt | 498 +++++++++++++
>>> drivers/thermal/Kconfig | 13 +
>>> drivers/thermal/Makefile | 1 +
>>> drivers/thermal/of-thermal.c | 775
>>> +++++++++++++++++++++
>>> drivers/thermal/thermal_core.c | 9 +-
>>> drivers/thermal/thermal_core.h | 9 +
>>> include/dt-bindings/thermal/thermal.h | 27 +
>>> include/linux/thermal.h | 28 +-
>>> 8 files changed, 1357 insertions(+), 3 deletions(-)
>>> create mode 100644
>>> Documentation/devicetree/bindings/thermal/thermal.txt
>>> create mode 100644 drivers/thermal/of-thermal.c
>>> create mode 100644 include/dt-bindings/thermal/thermal.h
>>>
>>> ---
>>>
>>> Hello all,
>>>
>>> Thanks Joe Perches for the effort of reviewing this code, I really
>>> appreciate it.
>>>
>>> Here is a version with a refactored init function without leaks in
>>> the failure
>>> patch. I also added a couple of comments to better understand the
>>> intention of
>>> that function. Besides, when there are no memory available, the function
>>> rolls back what ever thermal zones have been added.
>>>
>>> Thanks,
>>>
>>> Eduardo
>>>
>>> diff --git a/Documentation/devicetree/bindings/thermal/thermal.txt
>>> b/Documentation/devicetree/bindings/thermal/thermal.txt
>>> new file mode 100644
>>> index 0000000..6664533
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/thermal/thermal.txt
>>> @@ -0,0 +1,498 @@
>>> +* Thermal Framework Device Tree descriptor
>>> +
>>> +Generic binding to provide a way of defining hardware thermal
>>> +structure using device tree. A thermal structure includes thermal
>>> +zones and their components, such as trip points, polling intervals,
>>> +sensors and cooling devices binding descriptors.
>>> +
>>> +The target of device tree thermal descriptors is to describe only
>>> +the hardware thermal aspects, not how the system must control or which
>>> +algorithm or policy must be taken in place.
>>> +
>>> +There are five types of nodes involved to describe thermal bindings:
>>> +- sensors: used to describe the device source of temperature sensing;
>>> +- cooling devices: used to describe devices source of power
>>> dissipation control;
>>> +- trip points: used to describe points in temperature domain defined to
>>> +make the system aware of hardware limits;
>>> +- cooling attachments: used to describe links between trip points and
>>> +cooling devices;
>> I think "attachments" sounds a bit odd, though I don't have a better
>> naming suggestion.
>
> "map" or "mapping" is better?

Well, yeah, that could work.

>
>>> +- thermal zones: used to describe thermal data within the hardware;
>>> +
>>> +It follows a description of each type of these device tree nodes.
>>> +
>>> +* Sensor devices
>>> +
>>> +Sensor devices are nodes providing temperature sensing capabilities
>>> on thermal
>>> +zones. Typical devices are I2C ADC converters and bandgaps. Theses
>>> are nodes
>>> +providing temperature data to thermal zones. Temperature sensor
>>> devices may
>>> +control one or more internal sensors.
>>> +
>>> +Required property:
>>> +- #sensor-cells: Used to provide sensor device specific
>>> information
>>> + while referring to it. Must be at least 1, in
>>> order
>>> + to identify uniquely the sensor instances within
>>> + the IC. See thermal zone binding for more
>>> details
>>> + on how consumers refer to sensor devices.
>> I don't see why this needs to be at least one cell -- if an IC only has
>> one temperature sensor it should be fine for this to be zero and have no
>> ambiguity.
>>
>> It may make sense to call these thermal sensor devices, there are plenty
>> of other sensors we might want to describe in the dt for other purposes,
>> and we can impose some consistency if we remove the ambiguity.
>>
>>> +
>>> +* Cooling device nodes
>>> +
>>> +Cooling devices are nodes providing control on power dissipation. There
>>> +are essentially two ways to provide control on power dissipation. First
>>> +is by means of regulating device performance, which is known as passive
>>> +cooling. Second is by means of activating devices in order to remove
>>> +the dissipated heat, which is known as active cooling, e.g. regulating
>>> +fan speeds. In both cases, cooling devices shall have a way to
>>> determine
>>> +the level of cooling.
>>> +
>>> +Required property:
>>> +- cooling-min-level: A unsigned integer indicating the smallest
>>> + cooling level accepted. Typically 0.
>>> +- cooling-max-level: An unsigned integer indicating the largest
>>> + cooling level accepted.
>> I'm not sure what a "cooling level" means. It seems a bit abstract. Is
>> this binding specific?
>
> It means cooling ability of a cooling device, such as speed of a fan.
> But it is better to use cooling-min/max-state I think, because the
> thermal management code is using "cooling state", concepts in dt and c
> should be identical.

I think the idea is just to be clear. Just because the thermal
management implementation uses a terminology, it does not necessarily
mean it is best fit for describing in device tree bindings. But I dont
really have strong opinion on level or state at this point.

>
>> How does this relate to cooling-cells? These seem to be a fixed size.
>
> I also think cooling-cells is redundant.

Cooling cells is not redundant, please check my explanation on the other
thread. It is required to determine which level/states are used while
binding.

>
>>> +- #cooling-cells: Used to provide cooling device specific
>>> information
>>> + while referring to it. Must be at least 2, in
>>> order
>>> + to specify minimum and maximum cooling level
>>> used
>>> + in the reference. See Cooling device
>>> attachments section
>>> + below for more details on how consumers refer to
>>> + cooling devices.
>> Are these cooling cells always expected to cover min and max, and are
>> min and max always expected to take the same number of cells? The above
>> cooling-*-level properties imply they each take up one cell.
>>
>> Does #cooling-cells = <3> make sense?. If this covers additional
>> information (e.g. which fan on a multi-fan controller), how does the OS
>> determine which cells are min and max, and how do these relate to
>> cooling-level-min and cooling-level-max?
>>
>>> +
>>> +* Trip points
>>> +
>>> +The trip node is a node to describe a point in the temperature domain
>>> +in which the system takes an action. This node describes just the
>>> point,
>>> +not the action.
>> I'm still not sure on this, as it sounds like embedding policy into the
>> DT, rather than a description of the hardware. I realise we need to
>> prevent devices burning out, so describing the max acceptable
>> temperature and possibly a preferred range makes sense, but do we need
>> this level of flexibility? Maybe we do.
>>
> Putting platform data into dt is acceptable, right?
> In a narrow sense, trip points are platform data, but in a broad sense,
> both trip point itself and its corresponding cooling device are all
> platform data.
>

I think Mark needs to elaborate a bit more to clarify why he sees this
as embedding policy into DT. And I believe it is fine to pass platform
data, as long as it represents hardware and it is not representing policy.

>>> +
>>> +Required properties:
>>> +- temperature: the trip temperature level, in milliCelsius.
>> Preferably, s/milliCelsius/millicelsius/ over the document.
>>
>> Given that we have signed values elsewhere, is this signed or unsigned?
>>
>>> +- hysteresis: a (low) hysteresis value on 'temperature'.
>>> This is a
>>> + relative value, in milliCelsius.
>>> +- type: the trip type. Here is the type mapping:
>>> + THERMAL_TRIP_ACTIVE 0: A trip point to enable active
>>> cooling
>>> + THERMAL_TRIP_PASSIVE 1: A trip point to enable
>>> passive cooling
>>> + THERMAL_TRIP_HOT 2: A trip point to notify emergency
>>> + THERMAL_TRIP_CRITICAL 3: Hardware not reliable.
>>> +
>>> +Refer to include/dt-bindings/thermal/thermal.h for definition of
>>> these consts.
>> Why not use a string for this? We do so for other type parameters in
>> other bindings (see dr_mode in
>> Documentation/devicetree/bindings/usb/generic.txt, phy-mode for ethernet
>> PHYs, etc).
>>
>>> +
>>> +* Cooling device attachments
>>> +
>>> +The cooling device attachments node is a node to describe how
>>> cooling devices
>>> +get assigned to trip points of the zone. The cooling devices are
>>> expected
>>> +to be loaded in the target system.
>>> +
>>> +Required properties:
>>> +- cooling-device: A phandle of a cooling device with its
>>> parameters,
>>> + referring to which cooling device is used in
>>> this
>>> + binding. The required parameters are: the
>>> minimum
>>> + cooling level and the maximum cooling level used
>>> + in this attach.
>> Might this be a list in general? The code doesn't have to support that
>> yet.
>>
>> It would be nice to have a name for the cells after a phandle which
>> describe the cooling device's configuration. You've called them
>> parameters here, but it probably makes sense to call them a
>> cooling-specifier (following clock-specifier and interrupt-specifier).
>>
>>> +- trip: A phandle of a trip point node within
>>> the same thermal
>>> + zone.
>>> +
>>> +Optional property:
>>> +- contribution: The cooling contribution to the
>>> thermal zone of the
>>> + referred cooling device at the referred trip
>>> point.
>>> + The contribution is a value from 0 to 100.
>>> The sum
>>> + of all cooling contributions within a thermal
>>> zone
>>> + must never exceed 100.
>> This is a bit arbitrary. Couldn't we sum all contributions to find the
>> total automatically, so this could be a simpler ratio?
>>
>>> +
>>> +Note: Using the THERMAL_NO_LIMIT (-1L) constant in the
>>> cooling-device phandle
>>> +limit parameters means:
>>> +(i) - minimum level allowed for minimum cooling level used in the
>>> reference.
>>> +(ii) - maximum level allowed for maximum cooling level used in the
>>> reference.
>>> +Refer to include/dt-bindings/thermal/thermal.h for definition of
>>> this constant.
>>> +
>>> +* Thermal zones
>>> +
>>> +The thermal-zone node is the node containing all the required info
>>> +for describing a thermal zone, including its cdev bindings. The
>>> thermal_zone
>>> +node must contain, apart from its own properties, one node containing
>>> +trip nodes and one node containing all the zone cooling attachments.
>> s/cdev/cooling device/ ?
>>
>>> +
>>> +Required properties:
>>> +- passive-delay: The maximum number of milliseconds to wait
>>> between polls
>>> + when performing passive cooling.
>>> +- polling-delay: The maximum number of milliseconds to wait
>>> between polls
>>> + when checking this thermal zone.
>> How about polling-delay-passive, polling-delay-active? I'm still not
>>
>>> +- sensors: A list of sensor phandles and their
>>> parameters. The
>>> + required parameter is the sensor id, in order to
>>> + identify internal sensors when the sensor IC
>>> features
>>> + several sensing units.
>> As mentioned above, I'm not sure you even need that one cell.
>>
>> - sensors: A list of sensor phandle +
>> thermal-sensor-specifier
>> cells describing the sensors monitoring the
>> thermal
>> zone.
>>
>>> +- trips: A sub-node containing several trip point
>>> nodes required
>>> + to describe the thermal zone.
>>> +- cooling-attachments A sub-node containing several cooling device
>>> attaches
>>> + nodes, used to describe the relation between
>>> trips
>>> + and cooling devices.
>>> +
>>> +Optional property:
>>> +- coefficients: An array of integers (one signed
>>> cell) containing
>>> + coefficients to compose a linear relation
>>> between
>>> + the sensors described in the sensors property.
>>> + Coefficients defaults to 1, in case this
>>> property
>>> + is not specified. A simple linear polynomial
>>> is used:
>>> + Z = c0 * x0 + c1 + x1 + ... + c(n-1) * x(n-1)
>>> + cn.
>>> +
>>> + The coefficients are ordered and they match
>>> with sensors
>>> + by means of sensor ID. Additional
>>> coefficients are
>>> + interpreted as constant offsets.
>> What is the end result of this? Presumably this is meant to result in an
>> estimate of the average temperature of the thermal zone, rather than an
>> arbitrary value? This should be mentioned.
>>
>> This doesn't seem to be used the in the code below...
>>
>>> +
>>> +Note: The delay properties are bound to the maximum dT/dt (temperature
>>> +derivative over time) in two situations for a thermal zone:
>>> +(i) - when active cooling is activated (passive-delay); and
>>> +(ii) - when the zone just needs to be monitored (polling-delay).
>>> +The maximum dT/dt is highly bound to hardware power consumption and
>>> dissipation
>>> +capability.
>> I'm not sure what you mean by this, could you elaborate?
>>
>> I guess you mean that the delays should be chosen to account for said
>> max dT/dt, such that a device can't unexpectedly cross several trip
>> boundaries between polls?
>>
>>> +
>>> +* Examples
>>> +
>>> +Below are several examples on how to use thermal data descriptors
>>> +using device tree bindings:
>>> +
>>> +(a) - CPU thermal zone
>>> +
>>> +The CPU thermal zone example below describes how to setup one
>>> thermal zone
>>> +using one single sensor as temperature source and many cooling
>>> devices and
>>> +power dissipation control sources.
>>> +
>>> +#include <dt-bindings/thermal/thermal.h>
>>> +
>>> +cpus {
>>> + cpu0: cpu@0 {
>>> + ...
>>> + cooling-min-level = <0>;
>>> + cooling-max-level = <3>;
>>> + #cooling-cells = <2>; /* min followed by max */
>>> + };
>> What do those min and max mean in this context? What do the values in
>> the range [0,3] correspond to?
>>
>> I'm not sure it makes sense to describe passive cooling in this way --
>> the precise way an OS does less work on a device is fundamentally tied
>> to the design of the OS (it might be able to rate-limit requests, it
>> might be able to clock the device down, it might only be able to turn
>> the device off).
>>
>> I'm not sure there is anything we can describe for passive cooling
>> (beyond the thermal limits the OS should attempt to stick to).
>>
>>> + ...
>>> +};
>>> +
>>> +&i2c1 {
>>> + ...
>>> + fan0: fan@0x48 {
>>> + ...
>>> + cooling-min-level = <0>;
>>> + cooling-max-level = <9>;
>>> + #cooling-cells = <2>; /* min followed by max */
>>> + };
>> What do min and max mean here for the fan?
>>
> It means the speed level of a fan. Even if a fan can run at a continuous
> speed range, but when it is controlled under thermal management
> framework, it can only run at discrete speeds.


Please, an important point here, this is not about thermal framework
requirement. But how we describe hardware. As you mentioned, even if
there are fan devices out there that uses continuous speed range, there
other devices, even fan devices or cpu frequency steps for instance,
that are discrete. Again, *this is not a thermal framework requirement*.

>>> +};
>>> +
>>> +bandgap0: bandgap@0x0000ED00 {
>>> + ...
>>> + #sensor-cells = <1>;
>>> +};
>>> +
>>> +cpu-thermal: cpu-thermal {
>> How do the thermal zones get probed?
>>
>>> + passive-delay = <250>; /* milliseconds */
>>> + polling-delay = <1000>; /* milliseconds */
>>> +
>>> + /* sensor ID */
>>> + sensors = <&bandgap0 0>;
>>> +
>>> + trips {
>>> + cpu-alert0: cpu-alert {
>>> + temperature = <90000>; /* milliCelsius */
>>> + hysteresis = <2000>; /* milliCelsius */
>>> + type = <THERMAL_TRIP_ACTIVE>;
>>> + };
>>> + cpu-alert1: cpu-alert {
>>> + temperature = <100000>; /* milliCelsius */
>>> + hysteresis = <2000>; /* milliCelsius */
>>> + type = <THERMAL_TRIP_PASSIVE>;
>>> + };
>>> + cpu-crit: cpu-crit {
>>> + temperature = <125000>; /* milliCelsius */
>>> + hysteresis = <2000>; /* milliCelsius */
>>> + type = <THERMAL_TRIP_CRITICAL>;
>>> + };
>>> + };
>>> +
>>> + cooling-attachments {
>>> + attach0 {
>>> + trip = <&cpu-alert0>;
>>> + cooling-device = <&fan0 THERMAL_NO_LIMITS 4>;
>>> + };
>>> + attach1 {
>>> + trip = <&cpu-alert1>;
>>> + cooling-device = <&fan0 5 THERMAL_NO_LIMITS>;
>>> + };
>>> + attach2 {
>>> + trip = <&cpu-alert1>;
>>> + cooling-device =
>>> + <&cpu0 THERMAL_NO_LIMITS
>>> THERMAL_NO_LIMITS>;
>>> + };
>>> + };
>> Was there a good reason for splitting trips and attachment?
>
> This is for adding/removing cooling device dynamically.

Well not really. I don't think this would prevent us of representing it
inside the trip point. It is, as I replied to Mark, because I wanted to
keep the flexibility to allow describing better properties related to
the cooling device while associated to a trip point alone, e.g.
'contribution'.

>
>>> +};
>>> +
>>> +In the example above, the ADC sensor at address 0x0000ED00 is used
>>> to monitor
>>> +the zone 'cpu-thermal' using its the sensor 0. The fan0, a fan
>>> device controlled
>>> +via I2C bus 1, at adress 0x48, is used to remove the heat out of the
>>> thermal
>>> +zone 'cpu-thermal' using its cooling levels from its minimum to 4,
>>> when it
>>> +reaches trip point 'cpu-alert0' at 90C, as an example of active
>>> cooling. The
>>> +same cooling device is used at 'cpu-alert1', but from 5 to its
>>> maximum level.
>>> +The cpu@0 device is also linked to the same thermal zone,
>>> 'cpu-thermal', as a
>>> +passive cooling device, using all its cooling levels at trip point
>>> 'cpu-alert1',
>>> +which is a trip point at 100C.
>>> +
>>
>
>
>
>
>


--
You have got to be excited about what you are doing. (L. Lamport)

Eduardo Valentin

Attachment: signature.asc
Description: OpenPGP digital signature