Re: [RFC][PATCH 1/2] Remove stop_machine from change_clocksource

From: Martin Schwidefsky
Date: Thu Jul 29 2010 - 03:11:41 EST


On Wed, 28 Jul 2010 09:12:49 -0700
john stultz <johnstul@xxxxxxxxxx> wrote:

> On Wed, 2010-07-28 at 09:17 +0200, Martin Schwidefsky wrote:
> > On Tue, 27 Jul 2010 19:06:41 -0700
> > John Stultz <johnstul@xxxxxxxxxx> wrote:
> >
> > If I look at
> > timekeeping_get_ns I don't see anything that prevents the
> > compiler from generating code that reads timekeeper.clock
> > multiple times. Which would mix the read function from one
> > clocksource with the cycle_last / mask values from the new
> > clock. Now if we add code that prevents the compiler from
> > reading from timekeeper.clock multiple times we might get
> > away with it.
>
> Right, but this should be ok. timekeeping_get_ns is a helper that
> requires the xtime_lock to be held (such a comment is probably needed,
> but there is no usage of it when the xtime_lock isn't held). While the
> function may actually mix values from two clocksources in a calculation,
> the results of those calculations will be thrown out and re-done via the
> xtime_lock seqlock.

Ok, all callers to timekeeping_get_ns use an xtime_lock loop to
make sure that no inconsistent result gets returned.

> > The reasoning for stop_machine is that the change of a
> > clocksource is a major change which has subtle side effects
> > so we want to make sure that nothing breaks. It is a very rare
> > event, we can afford to spent a little bit of time there.
> > Ergo stop_machine.
>
> I do agree that there can be subtle side effects when dealing with
> clocksources (part of why I'm being so cautious introducing this
> change), and when the stop_machine code was added it seemed reasonable.
> But given the limitations of stop_machine, the more I look at the
> clocksource_change code, the more I suspect stop_machine is overkill and
> we can safely just take the write lock on xtime_lock.
>
> If I'm still missing something, do let me know.

What about a clocksource_unregister while a cpu is in the middle of a
read_seqbegin/timekeeping_get_ns/read_seqretry? The clocksource structure
is "free" after the successful call to the unregister. At least in theory
this could be a use after free. The race window is tiny but on virtual
systems there can be an arbitrary delay in the ktime_get sequence.

I agree that stop_machine is the big gun and restricts the code in the way
how the clocksource functions may be call. But it is safe, no?

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/