Re: Re: PROBLEM: a bug in pi-futex may cause the program to hang

From: xby
Date: Mon Mar 28 2011 - 06:01:57 EST



At 2011-03-28 16:26:22,"Peter Zijlstra" <peterz@xxxxxxxxxxxxx> wrote:

>On Mon, 2011-03-28 at 15:25 +0800, xby wrote:
>> hi, all.
>
>Works better if you also CC people who actually work on that code.
>
>> There may be a bug in pi-futex that can cause a user-space program to hang.
>>
>> We have a board with a two-core PowerPC 8572 CPU. After running for one month, the pi-futex state in user space went bad: mutex->__data.__lock is 0x8000023e, mutex->__data.__count is 0, and mutex->__data.__owner is 0.
>>
>> Then I reviewed the file "kernel/futex.c" (Linux version 2.6.38) and found the following case:
>>
>> Suppose there are 3 threads, named threadA, threadB and threadC. threadA holds mutexM, and threadB and threadC are waiting for mutexM. They run through the following steps:
>>
>> 1. threadB and threadC sleep at line 1984.
>> 2. threadB receives a signal and is woken up.
>> 3. threadA unlocks mutexM and gives mutexM to threadB.
>> 4. threadB calls fixup_owner and tries to give the mutex to threadC.
>> 5. At line 1580, threadB triggers an address fault and goes to handle_fault.
>> 6. At line 1617, threadB releases the spinlock and then handles the fault.
>> 7. threadC gets the spinlock, calls fixup_owner, and gets mutexM.
>> 8. threadC gives mutexM to threadB.
>> 9. threadB re-acquires the spinlock, finds "pi_state->owner == oldowner", and retries the fixup.
>> 10. threadB gives mutexM to threadC, which is wrong.
>>
>> We have written a program that demonstrates all of the above.
>
>It would have been ever so much more useful if you'd have included that.

Sorry, the code is at the office, so I can't mail it to the list right now. I'm at home now ^-^
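
For reference, the reported __lock value 0x8000023e can be split into its fields. This is only a minimal sketch: the three masks are the FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK values from <linux/futex.h>, while the helper name decode_futex_word is made up for illustration and is not part of any kernel or glibc API.

```python
# Bit layout of a PI futex word, using the mask values from <linux/futex.h>.
FUTEX_WAITERS = 0x80000000     # at least one task is blocked in the kernel
FUTEX_OWNER_DIED = 0x40000000  # the previous owner exited while holding the lock
FUTEX_TID_MASK = 0x3FFFFFFF    # TID of the owning task, 0 if unowned


def decode_futex_word(word):
    """Split a 32-bit PI futex word into its three fields (hypothetical helper)."""
    return {
        "waiters": bool(word & FUTEX_WAITERS),
        "owner_died": bool(word & FUTEX_OWNER_DIED),
        "owner_tid": word & FUTEX_TID_MASK,
    }


if __name__ == "__main__":
    # __lock value reported from the hung board
    print(decode_futex_word(0x8000023E))
```

Read this way, the word has the waiters bit set and names TID 0x23e (574) as owner, while __owner in the glibc mutex is 0. That would be consistent with the kernel having assigned the lock to a thread that never recorded itself as owner in user space, though the decode alone does not prove which thread that was.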
