Re: [PATCH] thermal/core: Introduce user trip points
From: Rafael J. Wysocki
Date: Tue Jul 02 2024 - 07:03:24 EST
On Tue, Jul 2, 2024 at 12:56 PM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
> On 02/07/2024 12:22, Rafael J. Wysocki wrote:
> > On Tue, Jul 2, 2024 at 11:29 AM Daniel Lezcano
> > <daniel.lezcano@xxxxxxxxxx> wrote:
> >>
> >> On 01/07/2024 18:26, Rob Herring wrote:
> >>> On Thu, Jun 27, 2024 at 10:54:50AM +0200, Daniel Lezcano wrote:
> >>>> Currently the thermal framework has 4 trip point types:
> >>>>
> >>>> - active : basically for fans (or anything requiring energy to cool
> >>>> down)
> >>>>
> >>>> - passive : a performance limiter
> >>>>
> >>>> - hot : for a last action before reaching critical
> >>>>
> >>>> - critical : a point-of-no-return threshold leading to a system shutdown
> >>>>
> >>>> A thermal zone monitors the temperature against these trip
> >>>> points. The old way to do that is actively polling the temperature,
> >>>> which is very bad for embedded systems, especially mobile, and it is
> >>>> even worse today as we can have more than fifty thermal zones. The
> >>>> modern way is to rely on the driver to send an interrupt when the trip
> >>>> points are crossed, so the system can sleep while the temperature
> >>>> monitoring is offloaded to dedicated hardware.
> >>>>
> >>>> However, the thermal aspect is also managed from userspace to protect
> >>>> the user, especially by tracking the skin temperature sensor. The
> >>>> logic is more complex than what is found in the kernel because it
> >>>> needs multiple sources indicating the thermal situation of the entire
> >>>> system.
> >>>>
> >>>> For this reason it needs to set up trip points at different levels in
> >>>> order to get informed about what is going on with some thermal zones
> >>>> when running some specific application.
> >>>>
> >>>> For instance, the skin temperature must be limited to 43°C in the long
> >>>> run but can go to 48°C for 10 minutes, or 60°C for 1 minute.
> >>>>
> >>>> The thermal engine must then rely on trip points to monitor those
> >>>> temperatures. Unfortunately, today there are only 'active' and
> >>>> 'passive' trip points, which have a specific meaning for the kernel, not
> >>>> for userspace. That leads to hacks on different platforms for mobile
> >>>> and embedded systems where 'active' trip points are used to send
> >>>> notifications to userspace. This is obviously not right because
> >>>> these trips are handled by the kernel.
> >>>>
> >>>> This patch introduces the 'user' trip point type, whose semantics are
> >>>> simple: do nothing at the kernel level, just send a notification to
> >>>> userspace.
> >>>
> >>> Sounds like OS behavior/policy, though I guess the existing ones kind of
> >>> are too. Maybe we should have defined *what* action to take and then the OS
> >>> could decide which actions to handle vs. pass up a level.
> >>
> >> Right
> >>
> >>> Why can't userspace just ask to be notified at a trip point it
> >>> defines?
> >>
> >> Yes I think it is possible to create a netlink message to create a trip
> >> point which will return a trip id.
> >>
> >> Rafael, what do you think?
> >
> > Trips cannot be created on the fly ATM.
> >
> > What can be done is to create trips that are invalid to start with and
> > then set their temperature via sysfs. This has been done already for
> > quite a while AFAICS.
>
> Yes, I remember that.
>
> I would like to avoid introducing more weirdness in the thermal
> framework, which deserves a clear ABI.
>
> What is missing to create new trip points on the fly?

A different data structure to store them (essentially, a list instead
of an array).

I doubt it's worth the hassle.
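
For illustration only, here is a rough userspace-style sketch of what
"a list instead of an array" would mean: trips that can be added and
removed at run time and are identified by an id. All names here
(struct user_trip, struct trip_list, trip_list_add() and so on) are
made up for this example; this is not the kernel's struct thermal_trip
and not a proposed implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified trip; NOT the kernel's struct thermal_trip. */
struct user_trip {
	int id;                 /* identifier handed back to the creator */
	int temperature;        /* millidegrees Celsius */
	struct user_trip *next;
};

/* Hypothetical container: a singly linked list instead of a fixed array. */
struct trip_list {
	struct user_trip *head;
	int next_id;
};

/* Add a trip at run time; returns its id, or -1 on allocation failure. */
static int trip_list_add(struct trip_list *list, int temperature)
{
	struct user_trip *trip;

	assert(list != NULL);

	trip = malloc(sizeof(*trip));
	if (!trip)
		return -1;

	trip->id = list->next_id++;
	trip->temperature = temperature;
	trip->next = list->head;
	list->head = trip;

	return trip->id;
}

/* Remove the trip with the given id; returns 0, or -1 if it is not found. */
static int trip_list_del(struct trip_list *list, int id)
{
	struct user_trip **link;

	assert(list != NULL);

	for (link = &list->head; *link; link = &(*link)->next) {
		if ((*link)->id == id) {
			struct user_trip *victim = *link;

			*link = victim->next;
			free(victim);
			return 0;
		}
	}

	return -1;
}

/* Count the trips currently stored in the list. */
static int trip_list_count(const struct trip_list *list)
{
	const struct user_trip *trip;
	int count = 0;

	for (trip = list->head; trip; trip = trip->next)
		count++;

	return count;
}
```

With a fixed array the number of trips is decided when the zone is
registered; the list trades that simplicity for allocation, lookup by
id and lifetime management, which is the hassle referred to above.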

What's wrong with the current approach mentioned above? It will need
to be supported going forward anyway.
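
As a rough sketch of that approach from the userspace side: a trip that
starts out invalid gets its temperature written through the
trip_point_N_temp attribute of the thermal zone in sysfs, in
millidegrees Celsius. The helper names below (trip_temp_path(),
write_trip_temp()) are made up for this example, and it assumes the
trip in question is writable on the platform (which depends on the
kernel configuration and on the driver).

```c
#include <stdio.h>

/*
 * Build the sysfs path of a trip temperature attribute.
 * Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper, for illustration only.
 */
static int trip_temp_path(char *buf, size_t len, int zone, int trip)
{
	int n = snprintf(buf, len,
			 "/sys/class/thermal/thermal_zone%d/trip_point_%d_temp",
			 zone, trip);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a temperature in millidegrees Celsius to the given attribute.
 * Returns 0 on success, -1 on failure (for example, if the attribute
 * is read-only or the kernel rejects the value).
 * Hypothetical helper, for illustration only.
 */
static int write_trip_temp(const char *path, int millicelsius)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;

	if (fprintf(f, "%d\n", millicelsius) < 0) {
		fclose(f);
		return -1;
	}

	/* Write errors may only be reported when the stream is flushed. */
	return fclose(f) == 0 ? 0 : -1;
}
```

For the skin temperature example quoted above, userspace would write
43000 to one such trip and 48000 to another, then wait for the
crossing notifications.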