Re: [PATCH 1/2] thermal: qcom: tsens: atomic temperature read with hardware-guided retries
From: Daniel Lezcano
Date: Mon May 04 2026 - 13:29:48 EST
On 4/30/26 07:44, Priyansh Jain wrote:
The existing TSENS temperature read logic polls the valid bit and then
reads the temperature register. When temperature reads are triggered
at very short intervals, this can race with hardware updates and allow
the temperature field to be read while it is still being updated.
In this case, the valid bit may already be asserted even though the
temperature value is transitioning, resulting in an incorrect reading.
Hardware programming guidelines require the temperature value and the
valid bit to be sampled atomically in the same read transaction. A
reading is considered valid only if the valid bit is observed set in
that same sample.
The guidelines further specify that software should attempt the
temperature read up to three times to account for transient update
windows. If none of the attempts observe a valid sample, a stable
fallback value must be returned: if the first and second samples match,
the second value is returned; otherwise, if the second and third
samples match, the third value is returned.
Update the TSENS sensor read logic to implement atomic sampling along
with the recommended retry-and-compare fallback behavior. This removes
the race window and ensures deterministic temperature values in
accordance with hardware requirements.
Signed-off-by: Priyansh Jain <priyansh.jain@xxxxxxxxxxxxxxxx>
---
drivers/thermal/qcom/tsens-v1.c | 6 +-
drivers/thermal/qcom/tsens-v2.c | 6 +-
drivers/thermal/qcom/tsens.c | 118 +++++++++++++++++++++-----------
drivers/thermal/qcom/tsens.h | 22 ++----
4 files changed, 91 insertions(+), 61 deletions(-)
diff --git a/drivers/thermal/qcom/tsens-v1.c b/drivers/thermal/qcom/tsens-v1.c
index faa5d00788ca..2e0a01348c48 100644
--- a/drivers/thermal/qcom/tsens-v1.c
+++ b/drivers/thermal/qcom/tsens-v1.c
@@ -77,6 +77,9 @@ static struct tsens_features tsens_v1_feat = {
.max_sensors = 11,
.trip_min_temp = -40000,
.trip_max_temp = 120000,
+ .valid_bit = BIT(14),
+ .last_temp_mask = 0x3FF,
This is GENMASK(9, 0)
+ .last_temp_resolution = 9,
Please comply with the SSOT, in the init function compute the mask with:
->last_temp_mask = GENMASK(9, 0);
and remove the initialization here
};
static struct tsens_features tsens_v1_no_rpm_feat = {
@@ -132,8 +135,7 @@ static const struct reg_field tsens_v1_regfields[MAX_REGFIELDS] = {
/* NO CRITICAL INTERRUPT SUPPORT on v1 */
/* Sn_STATUS */
- REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 9),
- REG_FIELD_FOR_EACH_SENSOR11(VALID, TM_Sn_STATUS_OFF, 14, 14),
+ REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 14),
/* xxx_STATUS bits: 1 == threshold violated */
REG_FIELD_FOR_EACH_SENSOR11(MIN_STATUS, TM_Sn_STATUS_OFF, 10, 10),
REG_FIELD_FOR_EACH_SENSOR11(LOWER_STATUS, TM_Sn_STATUS_OFF, 11, 11),
diff --git a/drivers/thermal/qcom/tsens-v2.c b/drivers/thermal/qcom/tsens-v2.c
index 8d9698ea3ec4..814147735ba5 100644
--- a/drivers/thermal/qcom/tsens-v2.c
+++ b/drivers/thermal/qcom/tsens-v2.c
@@ -56,6 +56,9 @@ static struct tsens_features tsens_v2_feat = {
.max_sensors = 16,
.trip_min_temp = -40000,
.trip_max_temp = 120000,
+ .valid_bit = BIT(21),
+ .last_temp_mask = 0xFFF,
+ .last_temp_resolution = 11,
Ditto
};
static struct tsens_features ipq8074_feat = {
@@ -125,8 +128,7 @@ static const struct reg_field tsens_v2_regfields[MAX_REGFIELDS] = {
[WDOG_BARK_COUNT] = REG_FIELD(TM_WDOG_LOG_OFF, 0, 7),
/* Sn_STATUS */
- REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 11),
- REG_FIELD_FOR_EACH_SENSOR16(VALID, TM_Sn_STATUS_OFF, 21, 21),
+ REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 21),
/* xxx_STATUS bits: 1 == threshold violated */
REG_FIELD_FOR_EACH_SENSOR16(MIN_STATUS, TM_Sn_STATUS_OFF, 16, 16),
REG_FIELD_FOR_EACH_SENSOR16(LOWER_STATUS, TM_Sn_STATUS_OFF, 17, 17),
diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index a2422ebee816..15392a17ef41 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -315,10 +315,66 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
return degc;
}
+static inline enum tsens_ver tsens_version(struct tsens_priv *priv)
+{
+ return priv->feat->ver_major;
+}
I agree putting accessor functions is a good practice but here as it results in duplicating the function, the benefit is discutable.
+/**
+ * tsens_read_temp - To read temperature from hw in deciCelsius.
+ * @s: Pointer to sensor struct
+ * @field: Index into regmap_field array pointing to temperature data
+ * @temp: temperature in deciCelsius to be read from hardware
+ *
+ * This function handles temperature returned in ADC code or deciCelsius
+ * depending on IP version.
+ *
+ * Return: 0 on success, a negative errno will be returned in error cases
+ */
+static int tsens_read_temp(const struct tsens_sensor *s, int field, int *temp)
+{
+ struct tsens_priv *priv = s->priv;
+ int temp_val[3] = {0};
+ unsigned int status = 0;
+ int ret = 0, i;
+ int max_retry = 3;
Please avoid litterals. Add a macro for max number of retries. As the value 3 is not an arbitrary value but a documented value, add a small comment to tell it is a hardware requirement.
+ ret = regmap_field_read(priv->rf[field], &status);
+ if (ret)
+ return ret;
+
+ /* VER_0 doesn't have VALID bit */
+ if (tsens_version(priv) == VER_0) {
+ *temp = status;
+ return ret;
+ }
Please use a callback for v0 and v1. Set it at probe time, so the version does not have to be checked at very read.
+ for (i = 0; i < max_retry; i++) {
+ temp_val[i] = status & priv->feat->last_temp_mask;
+ if (() {
+ *temp = temp_val[i];
+ return ret;
+ }
+ ret = regmap_field_read(priv->rf[field], &status);
+ if (ret)
+ return ret;
It looks like more than max_retry is happening. One time before the loop, then 3 times in loop. So 4 times in total.
+ }
+
+ if (temp_val[0] == temp_val[1])
+ *temp = temp_val[1];
+ else if (temp_val[1] == temp_val[2])
+ *temp = temp_val[2];
+ else
+ return -EAGAIN;
We have a, b and c.
if a == b, then return b
else b == c, then return c
else return -EAGAIN
It is like we have two consecutives successful read. IMO that could be simplified to:
int prev = INTMAX;
/*
* An explanation ...
*/
for (i = 0; i < max_retry; i++) {
int value, valid;
ret = regmap_field_read(priv->rf[field], &status);
if (ret)
return ret;
value = FIELD_GET(priv->feat->last_temp_mask, status);
valid = FIELD_GET(priv->feat->valid_bit, status)
if (valid)
return value;
if (value == prev)
return value;
prev = value;
}
return -EAGAIN;
(Not tested)