Re: [PATCH v2] thermal: add sysfs_notify on some attributes

From: Eduardo Valentin
Date: Mon Mar 28 2016 - 21:35:52 EST

On Tue, Mar 15, 2016 at 11:08:00PM +0000, Pandruvada, Srinivas wrote:
> On Mon, 2016-03-14 at 11:12 -0700, Srikar Srimath Tirumala wrote:
> > Add a sysfs_notify on thermal_zone*/temp and cooling_device*/
> > cur_state whenever any trip is triggered or cur state is changed.
> >
> > This change allows usermode apps to register themselves to get
> > notified, when certain thermal conditions occur and reduce their
> > workload. This workload throttling allows usermode to react before
> > hardware clocks are throttled and keep some critical apps running
> > reliably longer.
> I think we need a combination of proposal in 
> and this.
> For example, this patch notifies that some trip is violated, but that is
> not enough for a user space application to take any action. User space
> may not care about some trip violations, as they may be transient. The
> patch from Eduardo addresses that by providing trip, temperature and last
> temperature information. But that patch only addresses hot trips. I
> understand why Eduardo doesn't want to be notified for passive trips, as
> there will be too many.

Yeah, my original intention was to avoid flooding userland, especially
through a pipe like the sysfs netlink, which is in use to deal with
other stuff.
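
For reference, the usermode side of the sysfs_notify approach would look
roughly like the sketch below. It assumes the patch under discussion is
applied, so that the kernel calls sysfs_notify() on thermal_zone0/temp. The
zone number, timeout and helper names (read_attr, wait_for_notify, watch)
are only illustrative.

```python
import os
import select

# Assumption: zone 0 exists and the kernel calls sysfs_notify() on this file.
TEMP_PATH = "/sys/class/thermal/thermal_zone0/temp"


def read_attr(fd):
    """Rewind and re-read the attribute; sysfs wants a fresh read per wakeup."""
    os.lseek(fd, 0, os.SEEK_SET)
    return os.read(fd, 64).decode().strip()


def wait_for_notify(fd, timeout_ms):
    """Return True if the file was signalled, False if the timeout expired."""
    poller = select.poll()
    # sysfs_notify() shows up as an exceptional condition, not as POLLIN.
    poller.register(fd, select.POLLPRI | select.POLLERR)
    events = poller.poll(timeout_ms)
    return any(rev & (select.POLLPRI | select.POLLERR) for _, rev in events)


def watch(path=TEMP_PATH, timeout_ms=5000, rounds=3):
    """Print the temperature each time the kernel signals a change."""
    fd = os.open(path, os.O_RDONLY)
    try:
        read_attr(fd)  # the initial read arms the notification
        for _ in range(rounds):
            if wait_for_notify(fd, timeout_ms):
                millicelsius = int(read_attr(fd))
                print("temp: %.1f C" % (millicelsius / 1000.0))
            else:
                print("no notification within %d ms" % timeout_ms)
    finally:
        os.close(fd)
```

Calling watch() on a kernel without the notify call simply times out on
every round. Note that each wakeup costs a poll return, a seek and a read
in every listener, which is where the flooding concern comes from.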

> So IMO we need some mechanism to turn off notification and decide which
> events will result in user space notifications.
> On some x86 systems we have 10+ passive/active trips, this will result
> in too many notifications. We may be in a thermally sensitive zone, where
> more code execution is more heat.

Exactly, that has a direct impact not only on the process waiting for
thermal notifications, but also on other processes listening to sysfs.

> We may have some mask of trips which will result in notifications.
> By default no notifications, unless some user space requests.

The complexity of such communication (or even the current status of the
sysfs ABI) starts to reach the limit of such a channel. We may definitely
consider other means, such as a /dev interface, just like IIO does.
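
To make the trip mask idea quoted above concrete, here is a rough model of
the policy only, not of any existing kernel interface. The class and
function names (TripNotifyMask, events_for) are hypothetical, and the event
tuple simply reuses the trip, temperature and last temperature fields
mentioned earlier in the thread.

```python
class TripNotifyMask:
    """Per-zone bitmask of trips that user space asked to be notified about."""

    def __init__(self):
        self.mask = 0  # default: no notifications until someone opts in

    def enable(self, trip):
        self.mask |= 1 << trip

    def disable(self, trip):
        self.mask &= ~(1 << trip)

    def should_notify(self, trip):
        return bool(self.mask & (1 << trip))


def events_for(mask, last_temp, temp, trip_temps):
    """Return (trip, temp, last_temp) for each enabled trip crossed upwards.

    Temperatures are in millidegrees Celsius; trip_temps is indexed by trip.
    """
    events = []
    for trip, trip_temp in enumerate(trip_temps):
        crossed_up = last_temp < trip_temp <= temp
        if crossed_up and mask.should_notify(trip):
            events.append((trip, temp, last_temp))
    return events


# Three trips; only trip 2 has been requested by user space.
mask = TripNotifyMask()
mask.enable(2)
print(events_for(mask, 40000, 90000, [50000, 70000, 85000]))
```

With the mask left at zero the same crossing generates nothing, which is
the "no notifications unless user space requests them" default. Carrying
per-trip state like this over sysfs is the kind of complexity I mean.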

> During last LPC we discussed using IIO for temperature threshold
> notifications and I submitted multiple changes for that. Looks like we
> also care about trip point changes. So I think we need a more
> comprehensive mechanism to address this.
> Maybe we should have a thermal mini summit during LPC again and decide
> on a comprehensive plan to address all asynchronous thermal notifications.

I have created a wiki for LPC 2016

Overall I believe we need to solve the (temperature) sensing in a more
structured way within the kernel. We have three subsystems that allow
performing temperature sensing. They are different in design and
concept, but still solve similar problems.

I would still prefer to get this better architected.

Of course, we do not need to wait until LPC to start drafting this.

Again, please let's generate enough quorum to run the micro conf.


Eduardo Valentin