Re: [PATCH 1/2] thermal: qcom: tsens: atomic temperature read with hardware-guided retries
From: Priyansh Jain
Date: Tue May 05 2026 - 02:11:25 EST
On 04-05-2026 10:59 pm, Daniel Lezcano wrote:
On 4/30/26 07:44, Priyansh Jain wrote:Thanks for pointing this out — yes, this approach looks better.
The existing TSENS temperature read logic polls the valid bit and then
reads the temperature register. When temperature reads are triggered
at very short intervals, this can race with hardware updates and allow
the temperature field to be read while it is still being updated.
In this case, the valid bit may already be asserted even though the
temperature value is transitioning, resulting in an incorrect reading.
Hardware programming guidelines require the temperature value and the
valid bit to be sampled atomically in the same read transaction. A
reading is considered valid only if the valid bit is observed set in
that same sample.
The guidelines further specify that software should attempt the
temperature read up to three times to account for transient update
windows. If none of the attempts observe a valid sample, a stable
fallback value must be returned: if the first and second samples match,
the second value is returned; otherwise, if the second and third
samples match, the third value is returned.
Update the TSENS sensor read logic to implement atomic sampling along
with the recommended retry-and-compare fallback behavior. This removes
the race window and ensures deterministic temperature values in
accordance with hardware requirements.
Signed-off-by: Priyansh Jain <priyansh.jain@xxxxxxxxxxxxxxxx>
---
drivers/thermal/qcom/tsens-v1.c | 6 +-
drivers/thermal/qcom/tsens-v2.c | 6 +-
drivers/thermal/qcom/tsens.c | 118 +++++++++++++++++++++-----------
drivers/thermal/qcom/tsens.h | 22 ++----
4 files changed, 91 insertions(+), 61 deletions(-)
diff --git a/drivers/thermal/qcom/tsens-v1.c b/drivers/thermal/qcom/ tsens-v1.c
index faa5d00788ca..2e0a01348c48 100644
--- a/drivers/thermal/qcom/tsens-v1.c
+++ b/drivers/thermal/qcom/tsens-v1.c
@@ -77,6 +77,9 @@ static struct tsens_features tsens_v1_feat = {
.max_sensors = 11,
.trip_min_temp = -40000,
.trip_max_temp = 120000,
+ .valid_bit = BIT(14),
+ .last_temp_mask = 0x3FF,
This is GENMASK(9, 0)
+ .last_temp_resolution = 9,
Please comply with the SSOT, in the init function compute the mask with:
->last_temp_mask = GENMASK(9, 0);
and remove the initialization here
If I understand correctly, you’re suggesting that the mask should simply be defined in the init function as follows:
priv->feat->last_temp_mask = GENMASK(priv->feat->last_temp_resolution, 0);
?
ACK
};
static struct tsens_features tsens_v1_no_rpm_feat = {
@@ -132,8 +135,7 @@ static const struct reg_field tsens_v1_regfields[MAX_REGFIELDS] = {
/* NO CRITICAL INTERRUPT SUPPORT on v1 */
/* Sn_STATUS */
- REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 9),
- REG_FIELD_FOR_EACH_SENSOR11(VALID, TM_Sn_STATUS_OFF, 14, 14),
+ REG_FIELD_FOR_EACH_SENSOR11(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 14),
/* xxx_STATUS bits: 1 == threshold violated */
REG_FIELD_FOR_EACH_SENSOR11(MIN_STATUS, TM_Sn_STATUS_OFF, 10, 10),
REG_FIELD_FOR_EACH_SENSOR11(LOWER_STATUS, TM_Sn_STATUS_OFF, 11, 11),
diff --git a/drivers/thermal/qcom/tsens-v2.c b/drivers/thermal/qcom/ tsens-v2.c
index 8d9698ea3ec4..814147735ba5 100644
--- a/drivers/thermal/qcom/tsens-v2.c
+++ b/drivers/thermal/qcom/tsens-v2.c
@@ -56,6 +56,9 @@ static struct tsens_features tsens_v2_feat = {
.max_sensors = 16,
.trip_min_temp = -40000,
.trip_max_temp = 120000,
+ .valid_bit = BIT(21),
+ .last_temp_mask = 0xFFF,
+ .last_temp_resolution = 11,
Ditto
I did not introduce this new function; it was already present and I only moved it from the bottom of the file to the top since it was being used in tsens_read_temp().
};
static struct tsens_features ipq8074_feat = {
@@ -125,8 +128,7 @@ static const struct reg_field tsens_v2_regfields[MAX_REGFIELDS] = {
[WDOG_BARK_COUNT] = REG_FIELD(TM_WDOG_LOG_OFF, 0, 7),
/* Sn_STATUS */
- REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 11),
- REG_FIELD_FOR_EACH_SENSOR16(VALID, TM_Sn_STATUS_OFF, 21, 21),
+ REG_FIELD_FOR_EACH_SENSOR16(LAST_TEMP, TM_Sn_STATUS_OFF, 0, 21),
/* xxx_STATUS bits: 1 == threshold violated */
REG_FIELD_FOR_EACH_SENSOR16(MIN_STATUS, TM_Sn_STATUS_OFF, 16, 16),
REG_FIELD_FOR_EACH_SENSOR16(LOWER_STATUS, TM_Sn_STATUS_OFF, 17, 17),
diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
index a2422ebee816..15392a17ef41 100644
--- a/drivers/thermal/qcom/tsens.c
+++ b/drivers/thermal/qcom/tsens.c
@@ -315,10 +315,66 @@ static inline int code_to_degc(u32 adc_code, const struct tsens_sensor *s)
return degc;
}
+static inline enum tsens_ver tsens_version(struct tsens_priv *priv)
+{
+ return priv->feat->ver_major;
+}
I agree putting accessor functions is a good practice but here as it results in duplicating the function, the benefit is discutable.
However, this change is no longer required as I am removing the use of tsens_version() in tsens_read_temp(). As discussed earlier with Konrad, it makes more sense to check for valid‑bit support rather than relying on the TSENS version check in tsens_read_temp().
ACK+/**
+ * tsens_read_temp - To read temperature from hw in deciCelsius.
+ * @s: Pointer to sensor struct
+ * @field: Index into regmap_field array pointing to temperature data
+ * @temp: temperature in deciCelsius to be read from hardware
+ *
+ * This function handles temperature returned in ADC code or deciCelsius
+ * depending on IP version.
+ *
+ * Return: 0 on success, a negative errno will be returned in error cases
+ */
+static int tsens_read_temp(const struct tsens_sensor *s, int field, int *temp)
+{
+ struct tsens_priv *priv = s->priv;
+ int temp_val[3] = {0};
+ unsigned int status = 0;
+ int ret = 0, i;
+ int max_retry = 3;
Please avoid litterals. Add a macro for max number of retries. As the value 3 is not an arbitrary value but a documented value, add a small comment to tell it is a hardware requirement.
Yes i am removing version check, instead adding valid bit check as+ ret = regmap_field_read(priv->rf[field], &status);
+ if (ret)
+ return ret;
+
+ /* VER_0 doesn't have VALID bit */
+ if (tsens_version(priv) == VER_0) {
+ *temp = status;
+ return ret;
+ }
Please use a callback for v0 and v1. Set it at probe time, so the version does not have to be checked at very read.
discussed with Konrad earlier.
+ for (i = 0; i < max_retry; i++) {
+ temp_val[i] = status & priv->feat->last_temp_mask;
+ if (() {
+ *temp = temp_val[i];
+ return ret;
+ }
+ ret = regmap_field_read(priv->rf[field], &status);
+ if (ret)
+ return ret;
It looks like more than max_retry is happening. One time before the loop, then 3 times in loop. So 4 times in total.
Thanks for pointing this out, Yes correct read will happen 4 times will
update the logic.
This approach has some misalignment with the HW recommendations.
+ }
+
+ if (temp_val[0] == temp_val[1])
+ *temp = temp_val[1];
+ else if (temp_val[1] == temp_val[2])
+ *temp = temp_val[2];
+ else
+ return -EAGAIN;
We have a, b and c.
if a == b, then return b
else b == c, then return c
else return -EAGAIN
It is like we have two consecutives successful read. IMO that could be simplified to:
int prev = INTMAX;
/*
* An explanation ...
*/
for (i = 0; i < max_retry; i++) {
int value, valid;
ret = regmap_field_read(priv->rf[field], &status);
if (ret)
return ret;
value = FIELD_GET(priv->feat->last_temp_mask, status);
valid = FIELD_GET(priv->feat->valid_bit, status)
if (valid)
return value;
if (value == prev)
return value;
prev = value;
}
return -EAGAIN;
(Not tested)
As per the HW guidelines, 3 back‑to‑back reads must be performed until a valid read is observed.
b or c should be returned only if none of the three reads(a,b,c) report the valid bit not set.
If a == b, return b
Else if b == c, return c
Else return -EAGAIN
Regards,
Priyansh