Re: 2.6.35-rc3 deadlocks on semaphore operations

From: Luca Tettamanti
Date: Thu Jun 24 2010 - 15:22:11 EST


On Wed, Jun 23, 2010 at 9:14 PM, Luca Tettamanti <kronos.it@xxxxxxxxx> wrote:
> On Wed, Jun 23, 2010 at 6:29 PM, Manfred Spraul
> <manfred@xxxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> I think I found it:
>> Previously, queue.status was never IN_WAKEUP when the semaphore spinlock was
>> held.
>>
>> The last patch changes that:
>> Now the change from IN_WAKEUP to the final result code happens after the
>> semaphore spinlock is dropped.
>> Thus a task can observe IN_WAKEUP even when it acquired the semaphore
>> spinlock.
>>
>> As a result, semop() sometimes returned 1 (IN_WAKEUP) for a successful
>> operation.
>>
>> Attached is a patch that should fix the bug.
>
> Apache seems fine.

Argh, "seems" was indeed appropriate. Manfred, your patch does
alleviate the problem, but something is still wrong. I noticed (I'm
developing an AJAX-heavy web app) that sometimes an Apache worker
hangs; I can reproduce the problem with ab (ApacheBench) and a
high concurrency level (I'm testing with concurrency 100 and 10k
requests, and I
get only 2-5 dropped requests). This does not happen with 2.6.34.
Any idea on how I can debug this further?
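
One thing that might help take Apache out of the picture is a small
user-space stress test on a single SysV semaphore. The following is only
a rough sketch, not a confirmed reproducer: the function names
(sem_step, stress_semop), the 30 second alarm and the process/iteration
counts are all made up for illustration. It relies on the observation
quoted above that semop() could return 1 (IN_WAKEUP) instead of 0, and
it treats a worker that never finishes as a failure too.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define union semun itself. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Apply one operation to semaphore 0 and return semop()'s raw result.
 * SEM_UNDO makes the kernel revert the change if the process dies. */
static int sem_step(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = SEM_UNDO };
    int ret;

    do {
        ret = semop(semid, &op, 1);
    } while (ret == -1 && errno == EINTR);
    return ret;
}

/* Fork nproc workers that each take and release the semaphore iters
 * times. Returns the number of workers that saw a semop() result other
 * than 0 or were killed by the (hypothetical) 30 second timeout, or -1
 * if the semaphore could not be set up. */
int stress_semop(int nproc, int iters)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    union semun arg = { .val = 1 };
    int started = 0, bad = 0;

    if (semid == -1)
        return -1;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();

        if (pid == -1)
            break;
        if (pid == 0) {
            alarm(30); /* a stuck worker is killed by SIGALRM */
            for (int j = 0; j < iters; j++) {
                int r = sem_step(semid, -1);
                if (r == 0)
                    r = sem_step(semid, 1);
                if (r != 0) {
                    fprintf(stderr, "pid %d: semop returned %d\n",
                            (int)getpid(), r);
                    _exit(1);
                }
            }
            _exit(0);
        }
        started++;
    }

    for (int i = 0; i < started; i++) {
        int status;

        if (wait(&status) == -1 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            bad++;
    }
    semctl(semid, 0, IPC_RMID);
    return bad;
}
```

Calling something like stress_semop(100, 10000) from main() and printing
the result should give 0 on a kernel that behaves; a non-zero count
would point at the semaphore code rather than at Apache.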

Luca