Re: [PATCH] w1: ds2780, fix potential deadlock on insertion and removal

From: Andrew Morton
Date: Tue Aug 16 2011 - 19:26:16 EST


On Fri, 12 Aug 2011 14:50:33 -0400
Clifton Barnes <cabarnes@xxxxxxxxxxxxxxxx> wrote:

> Simon Inizan found a synchronization problem with the w1 interface locking the
> mutex during insertion and removal, but the power supply interface trying to
> get POWER_SUPPLY_PROP_STATUS upon insertion and removal, which causes a 1-wire
> transaction that tries to lock the mutex again. The following patch has been
> tested to fix the problem. It is not a very elegant solution with using a
> variable to store the lock status, so if anyone has a better idea please
> present it.

Changing the type of the first arg to ds2780_read8() and friends
created a lot of patch noise - it would have been nice to separate that
out into a second patch. Not a major issue though.

> ---
> drivers/power/ds2780_battery.c | 86 +++++++++++++++++++++++----------------
> drivers/w1/slaves/w1_ds2780.c | 1 -
> 2 files changed, 51 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/power/ds2780_battery.c b/drivers/power/ds2780_battery.c
> index 1fefe82..2be668d 100644
> --- a/drivers/power/ds2780_battery.c
> +++ b/drivers/power/ds2780_battery.c
> @@ -39,6 +39,7 @@ struct ds2780_device_info {
>  	struct device *dev;
>  	struct power_supply bat;
>  	struct device *w1_dev;
> +	int lock_held;
>  };
>
>  enum current_types {
> @@ -49,8 +50,8 @@ enum current_types {
>  static const char model[] = "DS2780";
>  static const char manufacturer[] = "Maxim/Dallas";
>
> -static inline struct ds2780_device_info *to_ds2780_device_info(
> -	struct power_supply *psy)
> +static inline struct ds2780_device_info *
> +to_ds2780_device_info(struct power_supply *psy)
>  {
>  	return container_of(psy, struct ds2780_device_info, bat);
>  }
> @@ -60,17 +61,28 @@ static inline struct power_supply *to_power_supply(struct device *dev)
>  	return dev_get_drvdata(dev);
>  }
>
> -static inline int ds2780_read8(struct device *dev, u8 *val, int addr)
> +static inline int ds2780_battery_io(struct ds2780_device_info *dev_info,
> +	char *buf, int addr, size_t count, int io)
>  {
> -	return w1_ds2780_io(dev, val, addr, sizeof(u8), 0);
> +	if (dev_info->lock_held)
> +		return count;
> +	else
> +		return w1_ds2780_io(dev_info->w1_dev, buf, addr, count, io);
> +}

I think this is just not correct.

a) We only need to avoid the mutex_lock() if *this thread* already
holds the lock. But testing the flag in this manner causes the code
to avoid taking the lock if some other thread set lock_held. But
what we should have done in this case was to wait, by calling
mutex_lock().

b) If the lock was held, the function simply bails out, returning
incorrect data for a read() and doing nothing for a write().
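
To make (b) concrete, here is a small userspace sketch of what a caller
sees on the read side. The struct and function names are made up for
illustration; this is not the driver code, only the same early-return
shape:

```c
#include <stddef.h>

/* Hypothetical stand-in for ds2780_device_info, reduced to what matters. */
struct fake_info {
	int lock_held;		/* the flag the patch adds */
	unsigned char reg;	/* pretend device register */
};

/* Same shape as the patched helper: skip the I/O when the flag is set. */
static int flawed_read8(struct fake_info *info, unsigned char *val)
{
	if (info->lock_held)
		return (int)sizeof(*val);	/* reports success, *val untouched */

	*val = info->reg;			/* the "real" transfer */
	return (int)sizeof(*val);
}
```

With lock_held set, the return value says one byte was read, but the
caller's buffer still contains whatever was there before the call.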


A way to fix all this (still ugly though) would be to replace lock_held
with a task_struct* which points at the task which currently holds the
mutex, and is NULL if no task holds the mutex. Then we do

	if (dev_info->mutex_holder == current)
		w1_ds2780_io_nolock(...);
	else
		w1_ds2780_io(...);

Where w1_ds2780_io_nolock() is the guts of the current w1_ds2780_io(),
without the mutex_lock/unlock.
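
A rough userspace sketch of that holder-tracking scheme follows, using
pthreads in place of the kernel mutex. Everything in it is an assumption
made for illustration: fake_dev, fake_lock(), fake_unlock(), fake_io(),
fake_io_nolock() and battery_io() are hypothetical names, and the address
of a thread-local object stands in for "current". It is not the w1 code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Address of a thread-local object: unique per live thread, like "current". */
static _Thread_local char self_tag;

static uintptr_t self(void)
{
	return (uintptr_t)&self_tag;
}

struct fake_dev {
	pthread_mutex_t lock;
	_Atomic uintptr_t holder;	/* 0 when no thread holds the lock */
	char regs[16];			/* pretend device registers */
};

static void fake_lock(struct fake_dev *d)
{
	pthread_mutex_lock(&d->lock);
	atomic_store(&d->holder, self());
}

static void fake_unlock(struct fake_dev *d)
{
	atomic_store(&d->holder, 0);
	pthread_mutex_unlock(&d->lock);
}

/* The guts of the I/O. The caller must already hold d->lock. */
static int fake_io_nolock(struct fake_dev *d, char *buf, int addr,
			  size_t count, int io)
{
	assert(atomic_load(&d->holder) == self());
	if (addr < 0 || (size_t)addr + count > sizeof(d->regs))
		return -1;
	if (io)
		memcpy(d->regs + addr, buf, count);	/* write */
	else
		memcpy(buf, d->regs + addr, count);	/* read */
	return (int)count;
}

/* Locked wrapper: what every other caller uses. */
static int fake_io(struct fake_dev *d, char *buf, int addr,
		   size_t count, int io)
{
	int ret;

	fake_lock(d);
	ret = fake_io_nolock(d, buf, addr, count, io);
	fake_unlock(d);
	return ret;
}

/* Skip the lock only if *this thread* is the holder; otherwise wait. */
static int battery_io(struct fake_dev *d, char *buf, int addr,
		      size_t count, int io)
{
	if (atomic_load(&d->holder) == self())
		return fake_io_nolock(d, buf, addr, count, io);
	return fake_io(d, buf, addr, count, io);
}
```

The holder comparison can only come out true when the calling thread set
the field itself, so a thread that does not hold the lock always falls
through to the locked path and waits, which addresses (a). The transfer
is performed on both paths, which addresses (b).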

But it would be better to fix things properly :(
