Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling device registered

From: Javi Merino
Date: Wed Oct 14 2015 - 13:07:47 EST


On Mon, Oct 12, 2015 at 09:23:28AM +0000, Chen, Yu C wrote:
> Hi, Javi
> Sorry for my late response,
>
> > -----Original Message-----
> > From: Javi Merino [mailto:javi.merino@xxxxxxx]
> > Sent: Wednesday, September 30, 2015 12:02 AM
> > To: Chen, Yu C
> > Cc: linux-pm@xxxxxxxxxxxxxxx; edubezval@xxxxxxxxx; Zhang, Rui; linux-
> > kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a cooling
> > device registered
> >
> > Hi Yu,
> >
> > On Mon, Sep 28, 2015 at 06:52:00PM +0100, Chen, Yu C wrote:
> > > Hi, Javi,
> > >
> > > > -----Original Message-----
> > > > From: Javi Merino [mailto:javi.merino@xxxxxxx]
> > > > Sent: Monday, September 28, 2015 10:29 PM
> > > > To: Chen, Yu C
> > > > Cc: linux-pm@xxxxxxxxxxxxxxx; edubezval@xxxxxxxxx; Zhang, Rui;
> > > > linux- kernel@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH 3/3] Thermal: do thermal zone update after a
> > > > cooling device registered
> > > >
> > > > On Sun, Sep 27, 2015 at 06:48:44AM +0100, Chen Yu wrote:
> > > > > From: Zhang Rui <rui.zhang@xxxxxxxxx>
> > > > >
> > > > >
> > > >
> > > > I think you need to hold cdev->lock here, to make sure that no
> > > > thermal zone is added or removed from cdev->thermal_instances while
> > you are looping.
> > > >
> > > Ah right, will add. If I add the cdev ->lock here, will there be a
> > > AB-BA lock with thermal_zone_unbind_cooling_device?
> >
> > You're right, it could lead to a deadlock. The locks can't be swapped because
> > that won't work in step_wise.
> >
> > The best way that I can think of accessing thermal_instances atomically is by
> > making it RCU protected instead of with mutexes.
> > What do you think?
> >
> RCU would need extra spinlocks to protect the list, and need to sync_rcu after we delete
> one instance from thermal_instance list, I think it is too complicated for me to rewrite: (
> How about using thermal_list_lock instead of cdev ->lock?
> This guy should be big enough to protect the device.thermal_instance list.

thermal_list_lock protects thermal_tz_list and thermal_cdev_list, but
it doesn't protect the thermal_instances lists. For example,
thermal_zone_bind_cooling_device() adds an instance to the
cdev->thermal_instances list without taking thermal_list_lock.

To sum up, you have to protect accesses to the cdev->thermal_instances
list, but with the current locking scheme that would create an AB-BA
deadlock. As I see it, you would have to change the locking scheme:
either switch to RCU, or add a new mutex that protects both the
cdev->thermal_instances and tz->thermal_instances lists, and change all
accesses to them to comply with the new scheme.
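
To illustrate the second option, here is a rough userspace sketch (pthreads,
not kernel code) of a single mutex covering both instance lists. Every name
in it (struct zone, struct cdev, instances_lock, bind_cdev, unbind_cdev,
count_zones_of) is made up for the example and is not the thermal core's
API. With only one lock guarding both lists there is no second lock to take
in the opposite order, so the AB-BA case cannot arise on this path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
struct zone;
struct cdev;

struct instance {
	struct zone *tz;
	struct cdev *cdev;
	struct instance *tz_next;	/* link in zone->instances */
	struct instance *cdev_next;	/* link in cdev->instances */
};

struct zone { struct instance *instances; };
struct cdev { struct instance *instances; };

/* One lock for both lists, so there is no lock ordering to get wrong. */
static pthread_mutex_t instances_lock = PTHREAD_MUTEX_INITIALIZER;

/* Create an instance and link it into both lists under the single lock. */
static int bind_cdev(struct zone *tz, struct cdev *cdev)
{
	struct instance *inst = calloc(1, sizeof(*inst));

	if (!inst)
		return -1;
	inst->tz = tz;
	inst->cdev = cdev;

	pthread_mutex_lock(&instances_lock);
	inst->tz_next = tz->instances;
	tz->instances = inst;
	inst->cdev_next = cdev->instances;
	cdev->instances = inst;
	pthread_mutex_unlock(&instances_lock);
	return 0;
}

/* Unlink the (tz, cdev) instance from both lists; -1 if not bound. */
static int unbind_cdev(struct zone *tz, struct cdev *cdev)
{
	struct instance **pp, *found = NULL;
	int unlinked = 0, ret;

	pthread_mutex_lock(&instances_lock);
	for (pp = &tz->instances; *pp; pp = &(*pp)->tz_next) {
		if ((*pp)->cdev == cdev) {
			found = *pp;
			*pp = found->tz_next;
			break;
		}
	}
	if (found) {
		for (pp = &cdev->instances; *pp; pp = &(*pp)->cdev_next) {
			if (*pp == found) {
				*pp = found->cdev_next;
				unlinked = 1;
				break;
			}
		}
		/* Both lists are updated together, so it must be there. */
		assert(unlinked);
	}
	pthread_mutex_unlock(&instances_lock);

	ret = found ? 0 : -1;
	free(found);
	return ret;
}

/* Walk cdev->instances safely: nothing can be added or removed meanwhile. */
static int count_zones_of(struct cdev *cdev)
{
	struct instance *inst;
	int n = 0;

	pthread_mutex_lock(&instances_lock);
	for (inst = cdev->instances; inst; inst = inst->cdev_next)
		n++;
	pthread_mutex_unlock(&instances_lock);
	return n;
}
```

The loop in count_zones_of() stands in for the loop in the patch. The cost
of this scheme is that every existing user of either list has to be
converted to the new lock, which is the "change all accesses" part above.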

Is there a better way of solving this? Cheers,
Javi

