Re: [RFC][PATCH 3/5] timekeeping: Avoid possible deadlock from clock_was_set_delayed

From: John Stultz
Date: Thu Dec 12 2013 - 13:59:46 EST


On 12/12/2013 10:32 AM, Sasha Levin wrote:
> On 12/12/2013 11:34 AM, Sasha Levin wrote:
>> On 12/11/2013 02:11 PM, John Stultz wrote:
>>> As part of normal operations, the hrtimer subsystem frequently calls
>>> into the timekeeping code, creating a locking order of
>>> hrtimer locks -> timekeeping locks
>>>
>>> clock_was_set_delayed() was supposed to allow us to avoid deadlocks
>>> between the timekeeping and the hrtimer subsystems, so that we could
>>> notify the hrtimer subsystem that the time had changed while holding
>>> the timekeeping locks. This was done by scheduling delayed work
>>> that would run later once we were out of the timekeeping code.
>>>
>>> But unfortunately the lock chains are complex enough that in
>>> scheduling delayed work, we end up eventually trying to grab
>>> an hrtimer lock.
>>>
>>> Sasha Levin noticed this in testing when the new seqlock lockdep
>>> enablement triggered the following (somewhat abbreviated) message:
>>
>> [snip]
>>
>> This seems to work for me, I don't see the lockdep spew anymore.
>>
>> Tested-by: Sasha Levin <sasha.levin@xxxxxxxxxx>
>
> I think I spoke too soon.
>
> It took way more time to reproduce than previously, but I got:
>
>
> -> #1 (&(&pool->lock)->rlock){-.-...}:
> [ 1195.578519] [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
> [ 1195.578519] [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
> [ 1195.578519] [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
> [ 1195.578519] [<ffffffff843b0760>] _raw_spin_lock+0x40/0x80
> [ 1195.578519] [<ffffffff81153e0e>] __queue_work+0x14e/0x3f0
> [ 1195.578519] [<ffffffff81154168>] queue_work_on+0x98/0x120
> [ 1195.578519] [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
> [ 1195.578519] [<ffffffff811c4b41>] do_adjtimex+0x111/0x160
> [ 1195.578519] [<ffffffff811360e3>] SYSC_adjtimex+0x43/0x80
> [ 1195.578519] [<ffffffff8113612e>] SyS_adjtimex+0xe/0x10
> [ 1195.578519] [<ffffffff843baed0>] tracesys+0xdd/0xe2
> [ 1195.578519]

Are you sure you have that patch applied?

With it we shouldn't be calling clock_was_set_delayed() from do_adjtimex().
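
For illustration only, here is a minimal userspace sketch of the general
idea: do the update under the timekeeping lock, record that a notification
is needed, and notify only after the lock has been dropped. It is not the
kernel patch. Every name in it (tk_lock, hr_lock, set_tai_offset,
notify_clock_set, adjust_time) is a hypothetical stand-in, and pthread
mutexes stand in for the kernel's locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
static pthread_mutex_t tk_lock = PTHREAD_MUTEX_INITIALIZER; /* "timekeeping" lock */
static pthread_mutex_t hr_lock = PTHREAD_MUTEX_INITIALIZER; /* "hrtimer" lock */
static _Thread_local bool tk_held; /* does this thread hold tk_lock? */
static int tai_offset;
static int notify_count;

/* Takes hr_lock, so it must never be called with tk_lock held. */
static void notify_clock_set(void)
{
	assert(!tk_held); /* crude per-thread lock-order check */
	pthread_mutex_lock(&hr_lock);
	notify_count++;
	pthread_mutex_unlock(&hr_lock);
}

/*
 * Update state under tk_lock. Instead of notifying from inside the
 * locked region, only report whether a notification is needed.
 */
static bool set_tai_offset(int new_offset)
{
	bool changed;

	pthread_mutex_lock(&tk_lock);
	tk_held = true;
	changed = (new_offset != tai_offset);
	tai_offset = new_offset;
	tk_held = false;
	pthread_mutex_unlock(&tk_lock);

	return changed;
}

/* Caller: notify only after tk_lock has been released. */
static void adjust_time(int new_offset)
{
	if (set_tai_offset(new_offset))
		notify_clock_set();
}
```

In this sketch tk_lock is never held while hr_lock is taken, so the only
ordering that remains is the hrtimer -> timekeeping one described in the
patch description.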

thanks
-john



