Re: [PATCH v3 2/2] thermal/debugfs: Add thermal debugfs information for mitigation episodes

From: Daniel Lezcano
Date: Sat Dec 23 2023 - 06:42:03 EST



Hi Rafael,

On 21/12/2023 20:26, Rafael J. Wysocki wrote:

[ ... ]


+/**
+ * struct tz_events - Store all events related to a mitigation episode
+ *
+ * The tz_events structure describes a mitigation episode.

So why not call it tz_mitigation?

A mitigation episode = N x tz_events

eg.
trip A = passive cooling - cpufreq cluster0
trip B = passive cooling - cpufreq cluster0 + cluster1
trip C = active cooling + fan

temperature trip A < trip B < trip C

The mitigation episode, as defined, begins at trip A, and we can have multiple events (eg. trip B crossed several times, trip C, then trip B again etc ...).

[ ... ]

+ if (dfs->tz.trip_index < 0) {
+ tze = thermal_debugfs_tz_event_alloc(tz, now);
+ if (!tze)
+ return;
+
+ list_add(&tze->node, &dfs->tz.tz_events);
+ }
+
+ dfs->tz.trip_index++;
+ dfs->tz.trips_crossed[dfs->tz.trip_index] = trip_id;

So trip_index is an index into trips_crossed[] and the value is the ID
of the trip passed by thermal_debug_tz_trip_up() IIUC, so the trip IDs
in trips_crossed[] are always sorted by the trip temperature, in the
ascending order.

It would be good to write this down somewhere in a comment.

And what if trip temperatures change during a mitigation episode such
that the order by the trip temperature changes?

Changing a trip point temperature during a mitigation is a general question about the thermal framework.

How the governors will behave with such a change on the fly while they are in action?

IMO, we should prevent to change a trip point temperature when this one is crossed and has a cooling device bound to it.

+
+ tze = list_first_entry(&dfs->tz.tz_events, struct tz_events, node);
+ tze->trip_stats[trip_id].timestamp = now;
+ tze->trip_stats[trip_id].max = max(tze->trip_stats[trip_id].max, temperature);
+ tze->trip_stats[trip_id].min = min(tze->trip_stats[trip_id].min, temperature);
+ tze->trip_stats[trip_id].avg = tze->trip_stats[trip_id].avg +
+ (temperature - tze->trip_stats[trip_id].avg) /
+ tze->trip_stats[trip_id].count;
+
+ mutex_unlock(&dfs->lock);
+}



--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog