Re: [PATCH v3 02/12] ARM: edma: Don't clear EMR of channel in edma_stop

From: Sekhar Nori
Date: Mon Aug 12 2013 - 02:25:32 EST


On Monday 12 August 2013 09:59 AM, Joel Fernandes wrote:
> On Sun, Aug 11, 2013 at 11:25 PM, Sekhar Nori <nsekhar@xxxxxx> wrote:
>> On 8/8/2013 5:19 PM, Sekhar Nori wrote:
>>> On Monday 05 August 2013 09:44 PM, Joel Fernandes wrote:
>>>> We certainly don't want error conditions to be cleared any other
>>>> place but the EDMA error handler, as this will make us 'forget'
>>>> about missed events we might need to know errors have occurred.
>>>>
>>>> This fixes a race condition where the EMR was being cleared
>>>> by the transfer completion interrupt handler.
>>>>
>>>> Basically, what was happening was:
>>>>
>>>> Missed event
>>>> |
>>>> |
>>>> V
>>>> SG1-SG2-SG3-Null
>>>> \
>>>> \__TC Interrupt (Almost same time as ARM is executing
>>>> TC interrupt handler, an event got missed and also forgotten
>>>> by clearing the EMR).
>>>>
>>>> This causes the following problems:
>>>>
>>>> 1.
>>>> If error interrupt is also pending and TC interrupt clears the EMR
>>>> by calling edma_stop as has been observed in the edma_callback function,
>>>> the ARM will execute the error interrupt even though the EMR is clear.
>>>> As a result, the dma_ccerr_handler returns IRQ_NONE. If this happens
>>>> enough number of times, IRQ subsystem disables the interrupt thinking
>>>> its spurious which makes error handler never execute again.
>>>>
>>>> 2.
>>>> Also even if error handler doesn't return IRQ_NONE, the removing of EMR
>>>> removes the knowledge about which channel had a missed event, and thus
>>>> a manual trigger on such channels cannot be performed.
>>>>
>>>> The EMR is ultimately being cleared by the Error interrupt handler
>>>> once it is handled so we remove code that does it in edma_stop and
>>>> allow it to happen there.
>>>>
>>>> Signed-off-by: Joel Fernandes <joelf@xxxxxx>
>>>
>>> Queuing this for v3.11 fixes. While committing, I changed the headline
>>> to remove capitalization and made it more readable by removing register
>>> level details. The new headline is:
>>>
>>> ARM: edma: don't clear missed events in edma_stop()
>>
>> Forgot to ask, should this be tagged for stable? IOW, how serious is
>> this race in current kernel (without the entire series applied)? I have
>> never observed it myself - so please provide details how easy/difficult
>> it is to hit this condition.
>
> The race was uncovered by recent EDMA patch series, So this patch can
> go in for next kernel release as such, I am not aware of any other DMA
> user that maybe uncovering the race condition.

Okay, I wont queue for -rc then. If Vinod wants to take this along with
rest of the series, you can add my:

Acked-by: Sekhar Nori <nsekhar@xxxxxx>

Thanks,
Sekhar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/