Re: [PATCH,RFC] perf: panic due to inclied cpu context task_ctx value

From: Oleg Nesterov
Date: Wed Mar 30 2011 - 12:37:58 EST


On 03/30, Peter Zijlstra wrote:
>
> --- linux-2.6.orig/kernel/perf_event.c
> +++ linux-2.6/kernel/perf_event.c
> @@ -125,9 +125,25 @@ enum event_type_t {
> * perf_sched_events : >0 events exist
> * perf_cgroup_events: >0 per-cpu cgroup events exist on this cpu
> */
> -atomic_t perf_sched_events __read_mostly;
> +atomic_t perf_sched_events_in __read_mostly;
> +atomic_t perf_sched_events_out __read_mostly;
> static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
>
> +static void perf_sched_events_inc(void)
> +{
> + jump_label_inc(&perf_sched_events_out);
> + jump_label_inc(&perf_sched_events_in);
> +}
> +
> +static void perf_sched_events_dec(void)
> +{
> + jump_label_dec(&perf_sched_events_in);
> + JUMP_LABEL(&perf_sched_events_in, no_sync);
> + synchronize_sched();
> +no_sync:
> + jump_label_dec(&perf_sched_events_out);
> +}

Nice! I didn't realize we can simply use JUMP_LABEL() directly and then
the code doesn't depend on HAVE_JUMP_LABEL.
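
For readers following the ordering in the quoted hunk, here is a minimal
user-space sketch of it. This is not kernel code: the toy_* names and the
nr_syncs counter are hypothetical, plain C11 atomics stand in for the
jump-label keys, and it assumes JUMP_LABEL(key, label) branches to the label
while the key is still non-zero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for perf_sched_events_in / perf_sched_events_out. */
static atomic_int toy_events_in;
static atomic_int toy_events_out;

/* Counts how often the (simulated) synchronize_sched() would have run. */
static int toy_nr_syncs;

static void toy_synchronize(void)
{
	toy_nr_syncs++;
}

/* Enable the "out" side first, then the "in" side, as in the quoted patch. */
static void toy_events_inc(void)
{
	atomic_fetch_add(&toy_events_out, 1);
	atomic_fetch_add(&toy_events_in, 1);
}

/*
 * Drop the "in" side first. Only when it reaches zero do we wait, and
 * only after that wait is the "out" side dropped. While other users
 * remain, the wait is skipped (the no_sync path).
 */
static void toy_events_dec(void)
{
	atomic_fetch_sub(&toy_events_in, 1);
	if (atomic_load(&toy_events_in) == 0)
		toy_synchronize();
	atomic_fetch_sub(&toy_events_out, 1);
	assert(atomic_load(&toy_events_out) >= 0);
}
```

In this model only the last toy_events_dec() pays for the wait, so the
"out" hook stays enabled until every CPU that could have seen the "in"
hook enabled has been waited for.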

Now, the problem is, after I read the comments I am not sure I understand
what synchronize_sched() actually does. Add Paul.



So. synchronize_sched() above should ensure that all CPUs do a context
switch at least once (ignoring idle). And I _thought_ that in practice
this should work.

But, unless I misread the comment above synchronize_sched(), it seems that
it only guarantees the end of "everything" which disables preemption,
explicitly or not. IOW, say, in theory rcu_read_unlock_sched() could
trigger ->passed_quiesc == T without reschedule.

Oh, and this is not theoretical, afaics. run_ksoftirqd() does
rcu_note_context_switch().
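
The distinction can be shown with a toy user-space model. This is only a
sketch under stated assumptions, not how RCU is implemented: the toy_*
names, TOY_NR_CPUS and the nr_switches counter are hypothetical, and
passed_quiesc merely mimics the ->passed_quiesc flag mentioned above. A
CPU can report a quiescent state either by really switching tasks or by
just noting one, the way run_ksoftirqd() does.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4	/* hypothetical CPU count for the model */

struct toy_cpu {
	bool passed_quiesc;		/* mimics ->passed_quiesc */
	unsigned long nr_switches;	/* real task switches seen */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Start a new (simulated) grace period: nobody has reported yet. */
static void toy_start_grace_period(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		toy_cpus[cpu].passed_quiesc = false;
		toy_cpus[cpu].nr_switches = 0;
	}
}

/* A real context switch: counts as a switch and as a quiescent state. */
static void toy_context_switch(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_cpus[cpu].nr_switches++;
	toy_cpus[cpu].passed_quiesc = true;
}

/* A quiescent state reported without switching tasks. */
static void toy_note_quiescent_state(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_cpus[cpu].passed_quiesc = true;
}

/* What the waiter actually gets: every CPU reported a quiescent state. */
static bool toy_grace_period_done(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (!toy_cpus[cpu].passed_quiesc)
			return false;
	return true;
}

/* What the patch would need: every CPU really switched tasks. */
static bool toy_all_cpus_switched(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpus[cpu].nr_switches == 0)
			return false;
	return true;
}
```

If three CPUs call toy_context_switch() and the fourth only calls
toy_note_quiescent_state(), toy_grace_period_done() is true while
toy_all_cpus_switched() is still false, which is the gap described above.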



So, I think we need something else :/

Oleg.
