On 20/04/2021 16:21, Lukasz Luba wrote:
Hi Daniel,
On 4/20/21 2:30 PM, Daniel Lezcano wrote:
On 19/04/2021 10:45, Lukasz Luba wrote:
[snip]
- instance->cdev->updated = false;
+ if (update)
+ instance->cdev->updated = false;
+
mutex_unlock(&instance->cdev->lock);
- (instance->cdev);
+
+ if (update)
+ thermal_cdev_update(instance->cdev);
This cdev update has something bad IMHO. It is protected by a mutex but
the 'updated' field is left unprotected before calling
thermal_cdev_update().
It is not the fault of this code but how the cooling device are updated
and how it interacts with the thermal instances.
IMO, part of the core code needs to revisited.
I agree, but please check my comments below.
This change tight a bit more the knot.
Would it make sense to you if we create a function eg.
__thermal_cdev_update()
I'm not sure if I assume it right that the function would only have the:
list_for_each_entry(instance, &cdev->thermal_instances, cdev_node)
loop from the thermal_cdev_update(). But if it has only this loop then
it's too little.
And then we have:
void thermal_cdev_update(struct thermal_cooling_device *cdev)
{
mutex_lock(&cdev->lock);
/* cooling device is updated*/
if (cdev->updated) {
mutex_unlock(&cdev->lock);
return;
}
__thermal_cdev_update(cdev);
thermal_cdev_set_cur_state(cdev, target);
Here we are actually setting the 'target' state via:
cdev->ops->set_cur_state(cdev, target)
then we notify, then updating stats.
cdev->updated = true;
mutex_unlock(&cdev->lock);
trace_cdev_update(cdev, target);
Also this trace is something that I'm using in my tests...
Yeah, I noticed right after sending the comments. All that should be
moved in the lockless function.
So this function becomes:
void thermal_cdev_update(struct thermal_cooling_device *cdev)
{
mutex_lock(&cdev->lock);
if (!cdev->updated) {
__thermal_cdev_update(cdev);
cdev->updated = true;
}
mutex_unlock(&cdev->lock);
dev_dbg(&cdev->device, "set to state %lu\n", target);
}
We end up with the trace_cdev_update(cdev, target) inside the mutex
section but that should be fine.
dev_dbg(&cdev->device, "set to state %lu\n", target);
}
And in this file we do instead:
- instance->cdev->updated = false;
+ if (update)
+ __thermal_cdev_update(instance->cdev);
mutex_unlock(&instance->cdev->lock);
- thermal_cdev_update(instance->cdev);
Without the line above, we are not un-throttling the devices.
Is it still true with the amended function thermal_cdev_update() ?