Re: Linux 3.1-rc9

From: Simon Kirby
Date: Tue Oct 18 2011 - 14:20:59 EST


On Tue, Oct 18, 2011 at 11:05:13AM +0200, Peter Zijlstra wrote:

> Subject: cputimer: Cure lock inversion
> From: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Date: Mon Oct 17 11:50:30 CEST 2011
>
> There's a lock inversion between the cputimer->lock and rq->lock; notably
> the two callchains involved are:
>
>  update_rlimit_cpu()
>    sighand->siglock
>    set_process_cpu_timer()
>      cpu_timer_sample_group()
>        thread_group_cputimer()
>          cputimer->lock
>          thread_group_cputime()
>            task_sched_runtime()
>              ->pi_lock
>              rq->lock
>
>  scheduler_tick()
>    rq->lock
>    task_tick_fair()
>      update_curr()
>        account_group_exec()
>          cputimer->lock
>
> Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
> the second one is keeping it up to date.
>
> This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
> SMP accounting oddities").
>
> Cure the problem by removing the cputimer->lock and rq->lock nesting;
> this leaves concurrent enablers doing duplicate work, but the time
> wasted should be of the same order as the time otherwise wasted spinning
> on the lock, and the greater-than assignment filter should ensure we
> preserve monotonicity.
>
> Reported-by: Dave Jones <davej@xxxxxxxxxx>
> Reported-by: Simon Kirby <sim@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxx
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
> kernel/posix-cpu-timers.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
> Index: linux-2.6/kernel/posix-cpu-timers.c
> ===================================================================
> --- linux-2.6.orig/kernel/posix-cpu-timers.c
> +++ linux-2.6/kernel/posix-cpu-timers.c
> @@ -274,9 +274,7 @@ void thread_group_cputimer(struct task_s
>  	struct task_cputime sum;
>  	unsigned long flags;
>  
> -	spin_lock_irqsave(&cputimer->lock, flags);
>  	if (!cputimer->running) {
> -		cputimer->running = 1;
>  		/*
>  		 * The POSIX timer interface allows for absolute time expiry
>  		 * values through the TIMER_ABSTIME flag, therefore we have
> @@ -284,8 +282,11 @@ void thread_group_cputimer(struct task_s
>  		 * it.
>  		 */
>  		thread_group_cputime(tsk, &sum);
> +		spin_lock_irqsave(&cputimer->lock, flags);
> +		cputimer->running = 1;
>  		update_gt_cputime(&cputimer->cputime, &sum);
> -	}
> +	} else
> +		spin_lock_irqsave(&cputimer->lock, flags);
>  	*times = cputimer->cputime;
>  	spin_unlock_irqrestore(&cputimer->lock, flags);
>  }
>

Tested-by: Simon Kirby <sim@xxxxxxxxxx>

Looks good running on three boxes since this morning (unpatched kernel
hangs in ~15 minutes).
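
For anyone skimming the thread, here is a minimal userspace sketch of the
AB-BA ordering the patch removes. This is plain pthreads, not kernel code;
the mutex names and the two thread functions are just stand-ins for the
call chains quoted above:

/*
 * Illustration only (userspace, pthreads): thread A mimics the
 * timer-enable path (cputimer->lock -> rq->lock), thread B mimics the
 * scheduler-tick path (rq->lock -> cputimer->lock).  If the two threads
 * interleave, each ends up holding one mutex while waiting on the other.
 * A single run may or may not hit the bad interleaving.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t cputimer_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;

/* like the pre-patch thread_group_cputimer() -> task_sched_runtime() */
static void *timer_enable_path(void *arg)
{
	pthread_mutex_lock(&cputimer_lock);	/* cputimer->lock */
	pthread_mutex_lock(&rq_lock);		/* rq->lock, nested inside */
	pthread_mutex_unlock(&rq_lock);
	pthread_mutex_unlock(&cputimer_lock);
	return NULL;
}

/* like scheduler_tick() -> update_curr() -> account_group_exec() */
static void *tick_path(void *arg)
{
	pthread_mutex_lock(&rq_lock);		/* rq->lock */
	pthread_mutex_lock(&cputimer_lock);	/* cputimer->lock, nested inside */
	pthread_mutex_unlock(&cputimer_lock);
	pthread_mutex_unlock(&rq_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, timer_enable_path, NULL);
	pthread_create(&b, NULL, tick_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("no deadlock this run\n");
	return 0;
}

With the old code, the enable path held cputimer->lock while
task_sched_runtime() went for rq->lock, and the tick path takes the two in
the opposite order, so two CPUs can each hold one lock and wait forever.
The patched version samples the thread group times first and only then
takes cputimer->lock, so the nesting disappears.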

While I have your eyes, does this hang trace (which happened a couple of
times with your previous patch applied) make any sense?

http://0x.ca/sim/ref/3.1-rc9/3.1-rc9-tcp-lockup.log

I don't see how all CPUs could be spinning on the same lock without
reentry, and I don't see any in the backtraces.

Simon-