Re: RFC: device thermal limits represented in device tree nodes

From: Stephen Warren
Date: Tue Jul 23 2013 - 21:44:50 EST


On 07/22/2013 07:25 AM, Eduardo Valentin wrote:
> Hello Grant and Rob,
>
> (Resending, as I got a message saying:
> <devicetree-discuss@xxxxxxxxxxxxxxxx>: Recipient address rejected:
> User has moved to devicetree at vger.kernel.org)
>
> I am writing this email to you specifically to ask for your technical
> assessment with respect to representing device thermal limits as
> device tree nodes. I am proposing to introduce device tree nodes to
> describe these limits as thermal zones, their composition and their
> relations with cooling devices and other thermal zones (thermal
> data).

Given:
https://lkml.org/lkml/2013/7/20/69
[PATCH 3/3] MAINTAINERS: Refactor device tree maintainership

I'm explicitly CCing a few people besides Grant/Rob, and quoting the
whole email.

From my perspective, the concept of including thermal limits in DT
seems reasonable, although I haven't looked at the proposed binding
itself in detail yet.

> As you should know, device thermal limits are part of hardware
> specification. Considering your board layout, mechanics, power
> dissipation and composition of ICs, etc, that will impose thermal
> requirements on your system, and infringing these limits can lead
> to device damage, reduced device lifetime or even end user harm.
> Thus, the thermal data help to describe the hardware limits and
> what needs to be done if those limits are crossed, as part of your
> board design and non-functional requirements. Obviously that is
> very dependent on your hardware, and not all of them will have
> these non-functional requirements. Besides, describing these limits
> has *nothing* to do with how you actually find these limits.
>
> In any case, there is a need to properly represent these
> requirements and I am proposing to have this representation in
> device tree. There were already a couple of counter-arguments
> claiming this is actually about configuration and performance
> profile description. But I still stand against these two readings
> of this proposal and again state that if one interprets it as
> configuration or performance profile, that is a misunderstanding
> [0]. Let me state it clearly (again [1]): my proposal is to describe
> hardware thermal limits, because these limits are part of a
> hardware specification; representing them in device tree would not
> infringe the original purpose of this data structure ("The Device
> Tree is a data structure for describing hardware."[2]).
>
> Before I explain my proposal, I also want to highlight that this
> data is already represented elsewhere and is reused across
> different OSes. Thermal data is described using ACPI [3], and
> ACPI-aware operating systems do support the interpretation of
> thermal data. Linux is one example of such systems (I believe I do
> not need to list here all systems supporting ACPI). On the other
> hand, not all systems have ACPI or are specified to use ACPI.
> Thus, here is another reason to represent thermal data properly, so
> that we can scale across systems.
>
> In the specific case of Linux, the common thermal concepts between
> ACPI systems and non-ACPI systems have been represented in the
> thermal framework (CONFIG_THERMAL). Today, on ACPI systems, thermal
> data is fetched from the bootloader with help from the common ACPI
> parser. For non-ACPI systems, the thermal data is actually coded as
> part of device drivers.
>
> So, to the point, a brief explanation of my proposal goes as
> follows:
>
> i - trip points: a node to describe a point in the temperature
>     domain in which the system has to take an action. This node
>     describes just the point, not the action. Properties here are
>     temperature, hysteresis, and type (critical, hot, passive,
>     active, etc).
>
> ii - binding parameters: the bind_param node is a node to describe
>      how actions (cooling devices) get assigned to trip points.
>      Cooling devices are expected to be loaded in the target
>      system. Properties here are: cooling device name, weight,
>      trip_mask and limits.
>
> iii - thermal zones: the thermal_zone node is the node containing
>       all the required info for describing a thermal zone with
>       hardware thermal limitation, including its bindings with
>       cooling devices. Properties here are: type, passive_delay,
>       polling_delay, governor. The thermal_zone node must contain,
>       apart from its own properties, one node containing trip nodes
>       and one node containing all the zone bind parameters.
>
> Here is an example (on OMAP4430):
>
> thermal_zone {
>     type = "CPU";
>     mask = <0x03>; /* trips writability */
>     passive_delay = <250>; /* milliseconds */
>     polling_delay = <1000>; /* milliseconds */
>     governor = "step_wise";
>     trips {
>         alert@100000{
>             temperature = <100000>; /* milliCelsius */
>             hysteresis = <2000>; /* milliCelsius */
>             type = <THERMAL_TRIP_PASSIVE>;
>         };
>         crit@125000{
>             temperature = <125000>; /* milliCelsius */
>             hysteresis = <2000>; /* milliCelsius */
>             type = <THERMAL_TRIP_CRITICAL>;
>         };
>     };
>     bind_params {
>         action@0{
>             cooling_device = "thermal-cpufreq";
>             weight = <100>; /* percentage */
>             mask = <0x01>;
>             /* no limits, using defaults */
>         };
>     };
> };
>
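
For anyone following along, here is a rough sketch in plain C of how
the three node types described above relate to each other. It is not
taken from the patch series: the struct and function names are
hypothetical, hysteresis is ignored, and reading bit n of a binding's
mask as "bound to trip n" is an assumption about the intended meaning.

```c
#include <assert.h>
#include <stddef.h>

/* Trip types named in the proposal; the enum itself is illustrative. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

struct trip {
    int temperature;            /* milliCelsius */
    int hysteresis;             /* milliCelsius (unused in this sketch) */
    enum trip_type type;
};

struct bind_param {
    const char *cooling_device;
    int weight;                 /* percentage */
    unsigned int trip_mask;     /* assumed: bit n set => applies to trips[n] */
};

struct thermal_zone {
    const char *type;
    int passive_delay;          /* milliseconds */
    int polling_delay;          /* milliseconds */
    const char *governor;
    const struct trip *trips;
    size_t num_trips;
    const struct bind_param *binds;
    size_t num_binds;
};

/* Values copied from the OMAP4430 example in the quoted mail. */
static const struct trip omap_trips[] = {
    { 100000, 2000, TRIP_PASSIVE },
    { 125000, 2000, TRIP_CRITICAL },
};

static const struct bind_param omap_binds[] = {
    { "thermal-cpufreq", 100, 0x01 },
};

static const struct thermal_zone omap_zone = {
    .type = "CPU",
    .passive_delay = 250,
    .polling_delay = 1000,
    .governor = "step_wise",
    .trips = omap_trips,
    .num_trips = 2,
    .binds = omap_binds,
    .num_binds = 1,
};

const struct thermal_zone *example_cpu_zone(void)
{
    return &omap_zone;
}

/* Bitmask of trips whose temperature is at or below temp_mc. */
unsigned int crossed_trips(const struct thermal_zone *tz, int temp_mc)
{
    unsigned int mask = 0;

    assert(tz != NULL && tz->num_trips <= 32);
    for (size_t i = 0; i < tz->num_trips; i++)
        if (temp_mc >= tz->trips[i].temperature)
            mask |= 1u << i;
    return mask;
}

/* Non-zero if the binding applies to at least one crossed trip. */
int bind_is_active(const struct bind_param *bp, unsigned int crossed)
{
    return (bp->trip_mask & crossed) != 0;
}
```

With the example values, a reading of 100000 mC crosses only the first
trip, so under the mask assumption above the "thermal-cpufreq" binding
would apply.
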
> In this current proposal, a 'thermal_zone' node would be embedded
> inside a temperature sensor node, for simplicity. But other
> possible builds could embed them in the device with thermal
> limits (CPU nodes, for instance) or they could be not embedded in
> any specific node.
>
> A fully documented description can be found here [4]. Also, a
> branch containing: (a) the changes needed in order to have this DT
> parser; (b) the DT parser with documentation; (c) examples of how
> drivers could be changed to use the parser can be found here [5]. I
> wrote the thermal DT parser to build thermal zones with the thermal
> framework API. However, if one does not want to do that, one can
> simply not set CONFIG_THERMAL_OF=y in her/his build, and the calls
> will be translated to nops, and the device tree thermal data can be
> parsed by whoever else is interested (another subsystem or even
> user land). A TODO on this implementation is that it still lacks
> the representation of thermal zones composed of several sensors.
> However, I believe it is better to take an incremental
> approach here. This series can already be used to improve most of
> the existing platform thermal drivers (most are CPU thermal
> drivers) and to reuse the existing code of some hwmon sensors to
> build thermal zones for board thermal requirements.
>
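
The "calls will be translated to nops" behaviour reads like the usual
config-stub pattern. A minimal sketch of that pattern follows, under
loud assumptions: SKETCH_THERMAL_OF merely stands in for
CONFIG_THERMAL_OF, and none of these names exist in the kernel or in
the series. What the real stubs return is not stated in the mail; NULL
is used here purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Kconfig symbol; 0 means "not built". */
#ifndef SKETCH_THERMAL_OF
#define SKETCH_THERMAL_OF 0
#endif

struct sketch_zone {
    int num_trips;
};

#if SKETCH_THERMAL_OF
/* Parser built in: pretend the node was parsed into a two-trip zone. */
static struct sketch_zone sketch_parsed_zone = { 2 };

static inline struct sketch_zone *sketch_of_build_zone(const char *node)
{
    (void)node;
    return &sketch_parsed_zone;
}
#else
/* Parser compiled out: the call collapses to a no-op stub. */
static inline struct sketch_zone *sketch_of_build_zone(const char *node)
{
    (void)node;
    return NULL;
}
#endif

/* A driver calls the helper unconditionally, with no #ifdef of its own. */
int sketch_sensor_probe(const char *node)
{
    struct sketch_zone *tz;

    assert(node != NULL);
    tz = sketch_of_build_zone(node);
    return tz ? tz->num_trips : 0;
}
```

The point of the pattern is that the driver code stays free of
conditionals; only the header providing the helper changes with the
configuration.
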
> I have already posted a patch series with this proposal in [6],
> which contains a reference to the original RFC. But it looks like my
> messages got moderated on the device tree mailing list. Obviously,
> within the PM forum, feedback was quite positive. However, we cannot
> proceed without proper assessment by other subsystems. The lm-sensors
> folks (Guenter) seem to be strongly against this series, as there
> is a fear that this may introduce a misuse of DT. I still
> believe this is needed for hardware description, and thus not an
> infringement on DT purposes.
>
> Please let me know your thoughts on this topic, and my apologies if
> my previous messages on this topic did not reach you (I hope they
> reach you now).
>
> All best,
>
> Eduardo Valentin
>
> [0] - https://lkml.org/lkml/2013/7/17/621
> [1] - https://lkml.org/lkml/2013/7/18/279
> [2] - www.devicetree.org
> [3] - http://www.acpi.info/
> [4] - https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/diff/Documentation/devicetree/bindings/thermal/thermal.txt?h=thermal_work/thermal_core/dt_parser&id=405bf0b51457ed055a082af2653d7ce757bc2e91
> [5] - https://git.kernel.org/cgit/linux/kernel/git/evalenti/linux.git/log/?h=thermal_work/thermal_core/dt_parser
> [6] - https://lkml.org/lkml/2013/7/17/923
