Re: [PATCH] tick/nohz: Reduce the critical region for jiffies_seq

From: Yunfeng Ye
Date: Mon Nov 16 2020 - 08:24:37 EST




On 2020/11/16 19:29, Thomas Gleixner wrote:
> On Mon, Nov 16 2020 at 14:07, Yunfeng Ye wrote:
>> On 2020/11/16 3:43, Thomas Gleixner wrote:
>>>> and the conflict between jiffies_lock and jiffies_seq increases,
>>>> especially in multi-core scenarios.
>>>
>>> This does not make sense. The sequence counter is updated when holding
>>> the lock, so there is no conflict between the lock and the sequence
>>> count.
>>>
>> Yes, there is no conflict between the lock and the sequence count, but
>> when tick_do_update_jiffies64() is called one by one, the sequence count
>> will be updated, it will affect the latency of tick_nohz_next_event(),
>> because the priority of read seqcount is less than writer.
>
> It's clear to me because I know how that code works, but for someone who
> is not familiar with it your description of the problem is confusing at
> best.
>
>> During a tick period,
>
> 'During a tick period' is misleading. The tick period is the reciprocal
> value of the tick frequency.
>
> What you want to explain is that if jiffies require an update, i.e.
>
> now > last_update + period
>
> then multiple CPUs might contend on it until last_update is forwarded
> and the quick check is preventing it again.
>
Yes, you are right, thanks.
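
To make the condition above concrete, here is a minimal user-space sketch
of the lockless quick check. It is not the kernel code: the names
last_update_ns, period_ns, jiffies_update_due() and forward_last_update()
are hypothetical, and the 4 ms period is only an assumed example value.
It uses the same ">=" comparison as the "delta >= tick_period" test in
tick_do_update_jiffies64().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins, not the kernel variables. */
static _Atomic int64_t last_update_ns;     /* plays the role of last_jiffies_update */
static const int64_t period_ns = 4000000;  /* assumed 4 ms tick period */

/* Lockless quick check: true once a full period has elapsed. */
static bool jiffies_update_due(int64_t now_ns)
{
	int64_t last;

	assert(now_ns >= 0);	/* timestamps are assumed to be non-negative */
	last = atomic_load_explicit(&last_update_ns, memory_order_relaxed);
	return now_ns >= last + period_ns;
}

/* Forward last_update by one period, as the locked slow path would do. */
static void forward_last_update(void)
{
	atomic_fetch_add_explicit(&last_update_ns, period_ns,
				  memory_order_relaxed);
}
```

Every CPU that evaluates jiffies_update_due() before forward_last_update()
has run sees "true" and goes on to the locked path. That is the window in
which the contention happens.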

>> tick_do_update_jiffies64() is called concurrently, and the
>> time is up to 30+us, so the lockless quick check in tick_do_update_jiffies64()
>> cannot intercept all concurrent callers.
>
> It cannot catch all of it, right.
>
> There are quite some other inefficiencies in that code and the seqcount
> held time can be reduced way further. Let me stare at it.
>
I think it is enough for the write seqcount to protect only
last_jiffies_update, jiffies_64 and tick_next_period. The modification,
which has not been tested, looks like this:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f0199a4ba1ad..d5f9930e6bc7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -66,15 +66,13 @@ static void tick_do_update_jiffies64(ktime_t now)

 	/* Reevaluate with jiffies_lock held */
 	raw_spin_lock(&jiffies_lock);
-	write_seqcount_begin(&jiffies_seq);

 	delta = ktime_sub(now, last_jiffies_update);
 	if (delta >= tick_period) {
+		ktime_t tmp_jiffies_update =
+			ktime_add(last_jiffies_update, tick_period);

 		delta = ktime_sub(delta, tick_period);
-		/* Pairs with the lockless read in this function. */
-		WRITE_ONCE(last_jiffies_update,
-			   ktime_add(last_jiffies_update, tick_period));

 		/* Slow path for long timeouts */
 		if (unlikely(delta >= tick_period)) {
@@ -82,21 +80,25 @@ static void tick_do_update_jiffies64(ktime_t now)

 			ticks = ktime_divns(delta, incr);

-			/* Pairs with the lockless read in this function. */
-			WRITE_ONCE(last_jiffies_update,
-				   ktime_add_ns(last_jiffies_update,
-						incr * ticks));
+			tmp_jiffies_update =
+				ktime_add_ns(tmp_jiffies_update,
+					     incr * ticks);
 		}
-		do_timer(++ticks);
+		ticks++;
+
+		write_seqcount_begin(&jiffies_seq);
+		/* Pairs with the lockless read in this function. */
+		WRITE_ONCE(last_jiffies_update, tmp_jiffies_update);
+		jiffies_64 += ticks;

 		/* Keep the tick_next_period variable up to date */
 		tick_next_period = ktime_add(last_jiffies_update, tick_period);
-	} else {
 		write_seqcount_end(&jiffies_seq);
+		calc_global_load();
+	} else {
 		raw_spin_unlock(&jiffies_lock);
 		return;
 	}
-	write_seqcount_end(&jiffies_seq);
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();
 }
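
The idea behind the diff above can also be shown outside the kernel. The
following is only a user-space sketch with C11 atomics, not the kernel
seqcount API: struct tick_state and the tick_state_*() helpers are
hypothetical names, and a single writer is assumed (in the patch,
jiffies_lock serializes the writers). The arithmetic is done first, and
the sequence count is odd only while the two results are stored, so a
reader can be forced to retry only during those stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the jiffies state. */
struct tick_state {
	atomic_uint seq;                 /* odd while a write is in progress */
	_Atomic int64_t last_update_ns;  /* plays the role of last_jiffies_update */
	_Atomic uint64_t jiffies;        /* plays the role of jiffies_64 */
};

static void tick_state_init(struct tick_state *ts)
{
	atomic_init(&ts->seq, 0);
	atomic_init(&ts->last_update_ns, 0);
	atomic_init(&ts->jiffies, 0);
}

/* Writer; callers are assumed to be serialized by an external lock. */
static void tick_state_update(struct tick_state *ts, int64_t now_ns,
			      int64_t period_ns)
{
	int64_t last, delta, new_last;
	uint64_t ticks, new_jiffies;

	assert(period_ns > 0);

	/* All arithmetic happens before the write section opens. */
	last = atomic_load_explicit(&ts->last_update_ns, memory_order_relaxed);
	delta = now_ns - last;
	if (delta < period_ns)
		return;
	ticks = (uint64_t)(delta / period_ns);
	new_last = last + (int64_t)ticks * period_ns;
	new_jiffies = atomic_load_explicit(&ts->jiffies,
					   memory_order_relaxed) + ticks;

	/* Write section: only the stores that readers can observe. */
	atomic_fetch_add_explicit(&ts->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ts->last_update_ns, new_last,
			      memory_order_relaxed);
	atomic_store_explicit(&ts->jiffies, new_jiffies, memory_order_relaxed);
	atomic_fetch_add_explicit(&ts->seq, 1, memory_order_release);
}

/* Reader: retries while a write is in progress or has intervened. */
static void tick_state_read(struct tick_state *ts, int64_t *last_ns,
			    uint64_t *jiffies)
{
	unsigned int s1, s2;

	do {
		s1 = atomic_load_explicit(&ts->seq, memory_order_acquire);
		*last_ns = atomic_load_explicit(&ts->last_update_ns,
						memory_order_relaxed);
		*jiffies = atomic_load_explicit(&ts->jiffies,
						memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&ts->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);
}
```

The reader never blocks the writer, which is why a long write section
shows up as reader latency. Keeping the section down to the stores is what
limits the retries.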

> Thanks,
>
> tglx
> .
>