Re: [PATCH 2/2] rcu: Fix up timeouts for forcing the quiescent state
From: Paul E. McKenney
Date: Fri Sep 04 2015 - 19:50:02 EST
On Fri, Sep 04, 2015 at 02:11:30PM +0200, Petr Mladek wrote:
> The deadline to force the quiescent state (jiffies_force_qs) is currently
> updated only when the previous timeout passed. But the timeout used for
> wait_event() is always the entire original timeout. This is strange.
They tell me that kthreads aren't supposed to every catch signals,
hence the WARN_ON() in the early-exit case stray-signal case.
In the case where we were awakened with an explicit force-quiescent-state
request, we do the scan, and then wait the full time for the next scan.
So the point of the delay is to space out the scans, not to fit a
pre-determined schedule.
The reason we get awakened with an explicit force-quiescent-state
request is that a given CPU just got inundated with RCU callbacks
or that rcutorture wants to hammer this code path.
So I am not seeing this as anything in need of fixing.
Am I missing something subtle here?
Thanx, Paul
> First, we might miss the deadline if we wait after a spurious wake up
> or after sleeping in cond_resched() because we wait too long.
>
> Second, we might do another forcing too early if the previous forcing
> was done earlier because of RCU_GP_FLAG_FQS and we later get a spurious
> wake up. IMHO, we should reset the deadline in this case.
>
> This patch updates the deadline "jiffies_force_qs" right after forcing
> the quiescent state by rcu_gp_fqs().
>
> Also it updates the remaining timeout according to the current jiffies and
> the requested deadline.
>
> It moves the cond_resched_rcu_qs() to a single place. It changes the order
> of the check for the pending signal. But there never should be a pending
> signal. If there was we would have bigger problems because wait_event()
> would never sleep again until someone flushed the signal.
>
> I have found these problems when trying to understand the code. I do not
> have any reproducer. I think that it is hardly visible because
> the spurious wakeup is rather theoretical.
>
> Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
> ---
> kernel/rcu/tree.c | 77 ++++++++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 53 insertions(+), 24 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 54af8d5f9f7b..aaeeabcba545 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2035,13 +2035,45 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
> }
>
> /*
> + * Normalize, update, and return the first timeout.
> + */
> +static unsigned long normalize_jiffies_till_first_fqs(void)
> +{
> + unsigned long j = jiffies_till_first_fqs;
> +
> + if (unlikely(j > HZ)) {
> + j = HZ;
> + jiffies_till_first_fqs = HZ;
> + }
> +
> + return j;
> +}
> +
> +/*
> + * Normalize, update, and return the first timeout.
> + */
> +static unsigned long normalize_jiffies_till_next_fqs(void)
> +{
> + unsigned long j = jiffies_till_next_fqs;
> +
> + if (unlikely(j > HZ)) {
> + j = HZ;
> + jiffies_till_next_fqs = HZ;
> + } else if (unlikely(j < 1)) {
> + j = 1;
> + jiffies_till_next_fqs = 1;
> + }
> +
> + return j;
> +}
> +
> +/*
> * Body of kthread that handles grace periods.
> */
> static int __noreturn rcu_gp_kthread(void *arg)
> {
> int gf;
> - unsigned long j;
> - int ret;
> + unsigned long timeout, j;
> struct rcu_state *rsp = arg;
> struct rcu_node *rnp = rcu_get_root(rsp);
>
> @@ -2071,22 +2103,18 @@ static int __noreturn rcu_gp_kthread(void *arg)
>
> /* Handle quiescent-state forcing. */
> rsp->fqs_state = RCU_SAVE_DYNTICK;
> - j = jiffies_till_first_fqs;
> - if (j > HZ) {
> - j = HZ;
> - jiffies_till_first_fqs = HZ;
> - }
> - ret = 0;
> + timeout = normalize_jiffies_till_first_fqs();
> + rsp->jiffies_force_qs = jiffies + timeout;
> for (;;) {
> - if (!ret)
> - rsp->jiffies_force_qs = jiffies + j;
> trace_rcu_grace_period(rsp->name,
> READ_ONCE(rsp->gpnum),
> TPS("fqswait"));
> rsp->gp_state = RCU_GP_WAIT_FQS;
> - ret = wait_event_interruptible_timeout(rsp->gp_wq,
> - rcu_gp_fqs_check_wake(rsp, &gf), j);
> + wait_event_interruptible_timeout(rsp->gp_wq,
> + rcu_gp_fqs_check_wake(rsp, &gf),
> + timeout);
> rsp->gp_state = RCU_GP_DOING_FQS;
> +try_again:
> /* Locking provides needed memory barriers. */
> /* If grace period done, leave loop. */
> if (!READ_ONCE(rnp->qsmask) &&
> @@ -2099,28 +2127,29 @@ static int __noreturn rcu_gp_kthread(void *arg)
> READ_ONCE(rsp->gpnum),
> TPS("fqsstart"));
> rcu_gp_fqs(rsp);
> + timeout = normalize_jiffies_till_next_fqs();
> + rsp->jiffies_force_qs = jiffies + timeout;
> trace_rcu_grace_period(rsp->name,
> READ_ONCE(rsp->gpnum),
> TPS("fqsend"));
> - cond_resched_rcu_qs();
> - WRITE_ONCE(rsp->gp_activity, jiffies);
> } else {
> /* Deal with stray signal. */
> - cond_resched_rcu_qs();
> - WRITE_ONCE(rsp->gp_activity, jiffies);
> WARN_ON(signal_pending(current));
> trace_rcu_grace_period(rsp->name,
> READ_ONCE(rsp->gpnum),
> TPS("fqswaitsig"));
> }
> - j = jiffies_till_next_fqs;
> - if (j > HZ) {
> - j = HZ;
> - jiffies_till_next_fqs = HZ;
> - } else if (j < 1) {
> - j = 1;
> - jiffies_till_next_fqs = 1;
> - }
> + cond_resched_rcu_qs();
> + WRITE_ONCE(rsp->gp_activity, jiffies);
> + /*
> + * Count the remaining timeout when it was a spurious
> + * wakeup. Well, it is useful also when we have slept
> + * in the cond_resched().
> + */
> + j = jiffies;
> + if (ULONG_CMP_GE(j, rsp->jiffies_force_qs))
> + goto try_again;
> + timeout = rsp->jiffies_force_qs - j;
> }
>
> /* Handle grace-period end. */
> --
> 1.8.5.6
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/