Re: [PATCH 8/9] hrtimer: Allow hrtimer::function() to free the timer
From: Kirill Tkhai
Date: Fri Jun 05 2015 - 05:28:07 EST
05.06.2015, 12:10, "Peter Zijlstra" <peterz@xxxxxxxxxxxxx>:
> On Fri, Jun 05, 2015 at 12:02:01PM +0300, Kirill Tkhai wrote:
>> Yeah, it's safe for now, but it may happen difficulties with a support
>> in the future, because barrier logic is not easy to review. But it seems
>> we may simplify it a little bit. Please, see the comments below.
>>> @@ -394,8 +399,24 @@ extern u64 hrtimer_get_next_event(void);
>>> */
>>> static inline int hrtimer_active(const struct hrtimer *timer)
>>> {
>>> + struct hrtimer_cpu_base *cpu_base;
>>> + unsigned int seq;
>>> + bool active;
>>> +
>>> + do {
>>> + active = false;
>>> + cpu_base = READ_ONCE(timer->base->cpu_base);
>>> + seqcount_lockdep_reader_access(&cpu_base->seq);
>>> + seq = raw_read_seqcount(&cpu_base->seq);
>>> +
>>> + if (timer->state != HRTIMER_STATE_INACTIVE ||
>>> + cpu_base->running == timer)
>>> + active = true;
>>> +
>>> + } while (read_seqcount_retry(&cpu_base->seq, seq) ||
>>> + cpu_base != READ_ONCE(timer->base->cpu_base));
>>> +
>>> + return active;
>>> }
>> This may race with migrate_hrtimer_list(), so it needs write seqcounter
>> too.
>
> Let me go stare at that.
>> My suggestion is do not use seqcounters for long parts of code, and implement
>> short primitives for changing timer state and cpu_base running timer. Something
>> like this:
>>
>> static inline void hrtimer_set_state(struct hrtimer *timer, unsigned long state)
>> {
>> struct hrtimer_cpu_base *cpu_base = timer->base->cpu_base;
>>
>> lockdep_assert_held(&cpu_base->lock);
>>
>> write_seqcount_begin(&cpu_base->seq);
>> timer->state = state;
>> write_seqcount_end(&cpu_base->seq);
>> }
>>
>> static inline void cpu_base_set_running(struct hrtimer_cpu_base *cpu_base,
>> struct hrtimer *timer)
>> {
>> lockdep_assert_held(&cpu_base->lock);
>>
>> write_seqcount_begin(&cpu_base->seq);
>> cpu_base->running = timer;
>> write_seqcount_end(&cpu_base->seq);
>> }
>>
>> Implemented this, we may less think about right barrier order, because
>> all changes are being made under seqcount.
>
> The problem with that is that it generates far more memory barriers,
> while on x86 these are 'free', this is very much not so for other archs
> like ARM / PPC etc.
Ok, thanks.
One more way is to take write_seqcount every time we're taking base spin_lock,
thus we may group several smp_wmb() in a single. But this increases write
seqlocked code areas, and worsens the speed of hrtimer_active(), so it's not
good too.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/