Re: [PATCH v2] blktrace: Fix potential deadlock between delete & sysfs ops

From: Waiman Long
Date: Fri Aug 18 2017 - 13:22:53 EST


On 08/18/2017 12:21 PM, Bart Van Assche wrote:
> On Fri, 2017-08-18 at 09:55 -0400, Waiman Long wrote:
>> On 08/17/2017 05:30 PM, Steven Rostedt wrote:
>>> On Thu, 17 Aug 2017 17:10:07 -0400
>>> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>>>> Instead of playing games with taking the lock, the only way this race
>>>> is hit, is if the partition is being deleted and the sysfs attribute is
>>>> being read at the same time, correct? In that case, just return
>>>> -ENODEV, and be done with it.
>>> Never mind, that won't work. Too bad there's not a mutex_lock_timeout()
>>> that we could use in a loop. It would solve the issue of forward
>>> progress with RT tasks, and will break after a timeout in case of
>>> deadlock.
>> I think it will be useful to have mutex_timed_lock(). RT-mutex does have
>> a timed version, so I guess it shouldn't be hard to implement one for
>> mutex. I can take a shot at trying to do that.
> (just caught up with the entire e-mail thread)
>
> Sorry Waiman but personally I thoroughly detest loops around mutex_trylock() or
> mutex_timed_lock() because such loops are usually used to paper over a problem
> instead of fixing the root cause. What I understood from the comment in v1 of your
> patch is that bd_mutex is not only held during block device creation and removal
> but additionally that bd_mutex is obtained inside sysfs attribute callback methods?
> That pattern is guaranteed to lead to deadlocks. Since the block device removal
> code waits until all sysfs callback methods have finished there is no need to
> protect against block device removal inside the sysfs callback methods. My proposal

You are right. We don't really need to take the bd_mutex, as the fact
that we are inside the sysfs callback method guarantees that the block
device won't go away.

> is to split bd_mutex: one global mutex that serializes block device creation and
> removal and one mutex per block device that serializes changes to a single block
> device. Obtaining the global mutex from inside a block device sysfs callback
> function is not safe but obtaining the per-block-device mutex from inside a sysfs
> callback function is safe.
>
> Bart.

The bd_mutex we are talking about here is already per block device. I am
thinking about having a global blktrace mutex that is used to serialize
the read and write of blktrace attributes. Since blktrace sysfs files
are not supposed to be frequently accessed, having a global lock
shouldn't cause any problem.
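
A minimal userspace sketch of that idea, not the actual patch: a pthread
mutex stands in for a kernel mutex, and every name below (blktrace_attr_lock,
struct fake_bdev, attr_show, attr_store) is hypothetical. The point is only
that the show and store paths take one global lock rather than the
per-device bd_mutex.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is an illustration, not kernel code. */

/* One global lock that serializes all blktrace attribute accesses. */
static pthread_mutex_t blktrace_attr_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for a block device; trace_mask plays the role of one attribute. */
struct fake_bdev {
	unsigned long trace_mask;
};

/* "show" path: format the attribute into buf while holding the global lock. */
static int attr_show(struct fake_bdev *bdev, char *buf, size_t len)
{
	int n;

	pthread_mutex_lock(&blktrace_attr_lock);
	n = snprintf(buf, len, "%lu\n", bdev->trace_mask);
	pthread_mutex_unlock(&blktrace_attr_lock);
	return n;
}

/* "store" path: parse outside the lock, update the attribute under it. */
static int attr_store(struct fake_bdev *bdev, const char *buf)
{
	char *end;
	unsigned long val = strtoul(buf, &end, 0);

	if (end == buf)
		return -1;	/* nothing parsed; reject the write */

	pthread_mutex_lock(&blktrace_attr_lock);
	bdev->trace_mask = val;
	pthread_mutex_unlock(&blktrace_attr_lock);
	return 0;
}
```

Because the lock is shared by every device, accesses to different devices
also serialize against each other, which is acceptable only on the
assumption stated above that these files are rarely accessed.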

Thanks,
Longman