Re: [Bug #11989] Suspend failure on NForce4-based boards due to chanes in stop_machine

From: Vegard Nossum
Date: Tue Nov 11 2008 - 08:46:50 EST


On Tue, Nov 11, 2008 at 2:36 PM, Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
> On Tue, Nov 11, 2008 at 11:52 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>> [ Cc:-ed workqueue/locking/suspend-race-condition experts. ]
>
> Heh. I am not expert, but I looked at the code. The obvious suspicious
> thing to see is the use of unpaired barriers? Maybe like this:

...

> 55 /* Last one to ack a state moves to the next state. */
> 56 static void ack_state(void)
> 57 {
> 58 if (atomic_dec_and_test(&thread_ack))
>
> Maybe
> + /* force ordering between thread_ack/state */
> + smp_rmb();
> here?

Oops, I am wrong (after a small investigation).

"1490 Any atomic operation that modifies some state in memory and
returns information
1491 about the state (old or new) implies an SMP-conditional general
memory barrier
1492 (smp_mb()) on each side of the actual operation (with the exception of
1493 explicit lock operations, described later). These include:
1494
...
1503 atomic_dec_and_test();"

Won't fix the problem at hand, but maybe something like this would be
nice for future generations :-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 0e688c6..6796bb1 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -55,6 +55,7 @@ static void set_state(enum stopmachine_state newstate)
/* Last one to ack a state moves to the next state. */
static void ack_state(void)
{
+ /* Implicit memory barrier; no smp_rmb() needed */
if (atomic_dec_and_test(&thread_ack))
set_state(state + 1);
}


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/