Re: [PATCH v2] ARM: Don't use complete() during __cpu_die

From: Paul E. McKenney
Date: Mon Mar 23 2015 - 11:37:57 EST

Next message: Daniel Lezcano: "Re: [PATCH V3 7/8] ARM: cpuidle: Register per cpuidle device"
Previous message: Joe Perches: "Re: [PATCH -next] netfilter: Use LOGLEVEL_<FOO> defines"
In reply to: Russell King - ARM Linux: "Re: [PATCH v2] ARM: Don't use complete() during __cpu_die"
Next in thread: Paul E. McKenney: "Re: [PATCH v2] ARM: Don't use complete() during __cpu_die"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Mar 23, 2015 at 02:00:41PM +0000, Russell King - ARM Linux wrote:
> On Mon, Mar 23, 2015 at 06:21:33AM -0700, Paul E. McKenney wrote:
> > On Mon, Mar 23, 2015 at 12:55:45PM +0000, Russell King - ARM Linux wrote:
> > > What if the cpu_hotplug_state for the CPU changes between reading it
> > > testing its value, and then writing it, or do we guarantee that it
> > > can't change other than by this function here? If it can't change,
> > > then what's the point of using atomic_t for this?
> >
> > It indeed cannot change -here-, but there are other situations where
> > more than one CPU can be attempting to change it at the same time.
> > The reason that it cannot change here is that the variable has the
> > value that indicates that the previous offline completed within the
> > timeout period, so there are no outstanding writes.
> >
> > When going offline, the outgoing CPU might take so long leaving that
> > the surviving CPU times out, thus both CPUs write to the variable at
> > the same time, which by itself requires atomics. If this seems
> > unlikely, consider virtualized environments where the hypervisor
> > might preempt the guest OS.
>
> If two CPUs write using atomic_set() to the same location, what value do
> you end up with? It's just the same as if two CPUs write normally to the
> same location.

If that happens when using these functions, that is a usage error.

> The only way it can be different is if you use atomic_cmpxchg(), or
> cmpxchg() so you can atomically test that the original value is what's
> expected prior to writing it.
>
> To illustrate this (I'll abbreviate the per-cpu state):
>
> CPU0 CPU1
> oldstate = atomic_read(state);
> if (oldstate == CPU_DEAD) /* it isn't */
> atomic_cmpxchg(state, oldstate, CPU_BROKEN)
> atomic_xchg(state, CPU_DEAD)
>
> /* succeeds */
> != old_state /* fails */
> ... so doesn't loop
> cpu_wait_death() returns false (failed)
>
> Here, we report that the CPU failed to go offline, but the state is set
> to CPU_DEAD. If it had been slightly earlier, CPU0 would have updated
> it to CPU_DEAD_POST.

Indeed. Which is precisely why cpu_check_up_prepare() returns an error
after that timeout occurs. This allows the arch code to decide what to
do, which might be to reset the CPU in some way. Or to simply error
out the attempt to online the CPU.

> If we instead got rid of this atomic stuff, and looked at the code
> without atomic_* stuff confusing the picture, we'd probably have seen
> that. This is why, whenever I see atomic_read() and atomic_set(), I'm
> very suspicious that the code is correct; virtually every time I've
> seen this pattern, the code has always had some problem. I utterly
> hate code which uses these functions.

They do have their uses, including this case.

> So, I'd suggest getting rid of atomic_read() and atomic_set() here
> converting atomic_cmpxchg() to a plain cmpxchg(), and replacing that
> atomic_xchg() with a similar call to cmpxchg() so that a broken CPU
> doesn't confusingly end up being reported as broken yet getting a
> CPU_DEAD state.

And exactly how would that help? You have all the same issues with this
transformed version of the code. You just have READ_ONCE() instead of
atomic_read() and WRITE_ONCE() instead of atomic_set(). Or you have
bare accesses that the compiler might well be able to optimize into
nonsensical code. Again, how does this help?

> After all, we know that a CPU going down _should_ be online before it
> goes down - if it isn't online, how can it be executing code. :)

That is indeed one scenario.

Another scenario is that the CPU failed to go down promptly during a
previous offline operation, and the arch-specific code chose to ignore
that error when the CPU was next brought online. In this case, at the
next offline, this code would se CPU_DEAD_FROZEN or CPU_BROKEN instead
of CPU_ONLINE. And there are architectures that do just this.

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Daniel Lezcano: "Re: [PATCH V3 7/8] ARM: cpuidle: Register per cpuidle device"
Previous message: Joe Perches: "Re: [PATCH -next] netfilter: Use LOGLEVEL_<FOO> defines"
In reply to: Russell King - ARM Linux: "Re: [PATCH v2] ARM: Don't use complete() during __cpu_die"
Next in thread: Paul E. McKenney: "Re: [PATCH v2] ARM: Don't use complete() during __cpu_die"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]