Re: [PATCH] ARM: edma: Clear IRQ if we get interrupted by a unknown event
From: Roger Quadros
Date: Mon Jul 13 2015 - 10:32:43 EST
Sekhar,
On 13/07/15 17:12, Sekhar Nori wrote:
> Hi Roger,
>
> On Monday 13 July 2015 06:12 PM, Roger Quadros wrote:
>> It looks like we can get an interrupt even when none of
>> the events that we're expecting occur. This can typically
>> happen due to incorrect usage of the dma engine by the
>> clients. If we don't Ack this interrupt then we get a
>> flood of them and then it gets disabled by the IRQ core.
>> (e.g. see backtrace below on am437x-sk-evm)
>>
>> Ack such an interrupt and print an error message so that
>> developers can identify the problem.
>
> The patch looks good except a comment (see below). Do you need this in
> -rc though? Looks like the final fix is still in mmc driver?
No need to hurry for -rc. It can go in via -next.
This patch does not fix the root cause, which is incorrect use of DMA by
the clients. It only keeps the interrupt from being permanently disabled
and prints a message on the console.
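
To illustrate why an un-acked interrupt ends up disabled, here is a
rough userspace model of the unhandled-IRQ accounting done by the IRQ
core. It is a sketch only: the struct, the function names and the
MOCK_* constants are made up for illustration, the window and limit
values are assumptions modelled on kernel/irq/spurious.c, and the real
code has additional checks that are left out here.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for the per-IRQ bookkeeping. */
struct mock_irq_desc {
	unsigned int irq_count;      /* interrupts seen in the current window */
	unsigned int irqs_unhandled; /* of those, how many nobody claimed */
	bool disabled;               /* set once the line is shut off */
};

/* Assumed values: evaluate every 100000 IRQs, give up above 99900 unhandled. */
#define MOCK_WINDOW 100000u
#define MOCK_LIMIT   99900u

/* Account for one interrupt; 'handled' is false when the handler returned IRQ_NONE. */
static void mock_note_interrupt(struct mock_irq_desc *d, bool handled)
{
	if (d->disabled)
		return;

	if (!handled)
		d->irqs_unhandled++;

	d->irq_count++;
	if (d->irq_count < MOCK_WINDOW)
		return;

	/* End of window: decide whether the line is stuck, then start over. */
	d->irq_count = 0;
	if (d->irqs_unhandled > MOCK_LIMIT)
		d->disabled = true;  /* the "Disabling IRQ #46" case */
	d->irqs_unhandled = 0;
}

/* Deliver n interrupts with the same outcome and report whether the line got disabled. */
static bool mock_flood(struct mock_irq_desc *d, unsigned int n, bool handled)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		mock_note_interrupt(d, handled);

	return d->disabled;
}
```

In this model a full window of interrupts that all come back unclaimed
gets the line disabled, while the same number of claimed interrupts
does not. Acking the event in the handler stops the flood before the
window fills up.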
>
>>
>> The following error was seen at boot on am437x-sk-evm
>>
>> [ 7.395763] irq 46: nobody cared (try booting with the "irqpoll" option)
>> [ 7.402499] CPU: 0 PID: 861 Comm: mmcqd/0 Not tainted 3.14.37-00337-g1b4e893 #1116
>> [ 7.410099] Backtrace:
>> [ 7.412588] [<c0011e84>] (dump_backtrace) from [<c0012020>] (show_stack+0x18/0x1c)
>> [ 7.420189] r6:0000002e r5:00000000 r4:00000000 r3:00000000
>> [ 7.425910] [<c0012008>] (show_stack) from [<c0600e4c>] (dump_stack+0x78/0x94)
>> [ 7.433178] [<c0600dd4>] (dump_stack) from [<c007b0fc>] (__report_bad_irq+0x28/0xc8)
>> [ 7.440950] r4:ec806500 r3:c088f358
>> [ 7.444558] [<c007b0d4>] (__report_bad_irq) from [<c007b6a4>] (note_interrupt+0x25c/0x2b8)
>> [ 7.452854] r6:0000002e r5:00000000 r4:ec806500 r3:0001863c
>> [ 7.458569] [<c007b448>] (note_interrupt) from [<c00791a0>] (handle_irq_event_percpu+0xb0/0x1a0)
>> [ 7.467389] r10:ec806500 r9:c08cb03b r8:0000002e r7:00000000 r6:00000000 r5:00000000
>> [ 7.475285] r4:00000000 r3:00000000
>> [ 7.478892] [<c00790f0>] (handle_irq_event_percpu) from [<c00792dc>] (handle_irq_event+0x4c/0x6c)
>> [ 7.487797] r10:00000006 r9:600f0113 r8:600f0113 r7:fa240100 r6:ecc3dcc0 r5:ec80655c
>> [ 7.495694] r4:ec806500
>> [ 7.498247] [<c0079290>] (handle_irq_event) from [<c007c3e0>] (handle_fasteoi_irq+0x84/0x150)
>> [ 7.506804] r6:ecc3dcc0 r5:0000002e r4:ec806500 r3:00000000
>> [ 7.512518] [<c007c35c>] (handle_fasteoi_irq) from [<c0078a7c>] (generic_handle_irq+0x28/0x38)
>> [ 7.521162] r4:0000002e r3:c007c35c
>> [ 7.524769] [<c0078a54>] (generic_handle_irq) from [<c000f238>] (handle_IRQ+0x40/0x9c)
>> [ 7.532716] r4:c0865ea8 r3:000001a0
>> [ 7.536321] [<c000f1f8>] (handle_IRQ) from [<c0008668>] (gic_handle_irq+0x30/0x64)
>> [ 7.543920] r6:ecc3dbd8 r5:c08709a0 r4:fa24010c r3:00000100
>> [ 7.549640] [<c0008638>] (gic_handle_irq) from [<c06062c0>] (__irq_svc+0x40/0x50)
>> [ 7.557154] Exception stack(0xecc3dbd8 to 0xecc3dc20)
>> [ 7.562226] dbc0: c08cccc0 00000000
>> [ 7.570439] dbe0: 0000000a 00000000 00000040 0000002c 00000000 ecc3c000 600f0113 600f0113
>> [ 7.578652] dc00: 00000006 ecc3dc64 ecc3dc20 ecc3dc20 c0040630 c00406a8 200f0113 ffffffff
>> [ 7.586860] r7:ecc3dc0c r6:ffffffff r5:200f0113 r4:c00406a8
>> [ 7.592581] [<c0040618>] (__do_softirq) from [<c0040ab8>] (irq_exit+0xa8/0xf8)
>> [ 7.599829] r10:00000006 r9:600f0113 r8:600f0113 r7:fa240100 r6:00000000 r5:0000002c
>> [ 7.607724] r4:ecc3c000
>> [ 7.610276] [<c0040a10>] (irq_exit) from [<c000f23c>] (handle_IRQ+0x44/0x9c)
>> [ 7.617351] r4:c0865ea8 r3:000001a0
>> [ 7.620955] [<c000f1f8>] (handle_IRQ) from [<c0008668>] (gic_handle_irq+0x30/0x64)
>> [ 7.628552] r6:ecc3dcc0 r5:c08709a0 r4:fa24010c r3:00000100
>> [ 7.634265] [<c0008638>] (gic_handle_irq) from [<c06062c0>] (__irq_svc+0x40/0x50)
>> [ 7.641777] Exception stack(0xecc3dcc0 to 0xecc3dd08)
>> [ 7.646850] dcc0: c08cf288 600f0193 c088f358 c088f358 c08cf288 00000001 00000027 c088f34c
>> [ 7.655062] dce0: 600f0113 600f0113 00000006 ecc3dd6c ecc3dcc0 ecc3dd08 c0076bac c00771f0
>> [ 7.663272] dd00: 600f0113 ffffffff
>> [ 7.666771] r7:ecc3dcf4 r6:ffffffff r5:600f0113 r4:c00771f0
>> [ 7.672485] [<c0076fd0>] (vprintk_emit) from [<c05fef40>] (printk+0x3c/0x44)
>> [ 7.679560] r10:00001ffe r9:00001000 r8:00000010 r7:00002000 r6:c08a5b00 r5:c08a5b2c
>> [ 7.687455] r4:c08a5a60
>> [ 7.690011] [<c05fef08>] (printk) from [<c0359684>] (credit_entropy_bits+0x238/0x260)
>> [ 7.697870] r3:00000002 r2:00000008 r1:c078cfa0 r0:c078cec4
>> [ 7.703583] [<c035944c>] (credit_entropy_bits) from [<c0359944>] (add_timer_randomness+0xd4/0xe4)
>> [ 7.712489] r10:ecc24008 r9:ecc24c00 r8:ecc1fb08 r7:00000000 r6:00000000 r5:c08a5b00
>> [ 7.720386] r4:ecc0b440
>> [ 7.722938] [<c0359870>] (add_timer_randomness) from [<c035a554>] (add_disk_randomness+0x2c/0x30)
>> [ 7.731844] r5:00000000 r4:ecc1fb08
>> [ 7.735451] [<c035a528>] (add_disk_randomness) from [<c027d888>] (blk_update_bidi_request+0x50/0x74)
>> [ 7.744625] [<c027d838>] (blk_update_bidi_request) from [<c027dbb0>] (blk_end_bidi_request+0x1c/0x58)
>> [ 7.753879] r6:00000000 r5:ecc28c00 r4:ecc1fb08 r3:00000000
>> [ 7.759592] [<c027db94>] (blk_end_bidi_request) from [<c027dc2c>] (blk_end_request+0x14/0x18)
>> [ 7.768149] r8:ecc1fb08 r7:00000000 r6:ecc24000 r5:00000000 r4:ecc24250 r3:00000000
>> [ 7.775973] [<c027dc18>] (blk_end_request) from [<c04cb48c>] (mmc_blk_issue_rw_rq+0x8c4/0xbd8)
>> [ 7.784625] [<c04cabc8>] (mmc_blk_issue_rw_rq) from [<c04cb96c>] (mmc_blk_issue_rq+0x1cc/0x4b8)
>> [ 7.793358] r10:00000001 r9:00000000 r8:ecc24000 r7:ecbfc29c r6:00000000 r5:ecc24008
>> [ 7.801255] r4:ecc24c00
>> [ 7.803807] [<c04cb7a0>] (mmc_blk_issue_rq) from [<c04cc4f8>] (mmc_queue_thread+0xb8/0x14c)
>> [ 7.812191] r10:00000001 r9:ecc24010 r8:00000000 r7:00000000 r6:ecc3c000 r5:ecc28c00
>> [ 7.820087] r4:ecc24008
>> [ 7.822648] [<c04cc440>] (mmc_queue_thread) from [<c00582c8>] (kthread+0xcc/0xe8)
>> [ 7.830159] r10:00000000 r9:00000000 r8:00000000 r7:c04cc440 r6:ecc24008 r5:ecc0b300
>> [ 7.838056] r4:00000000 r3:ecb1db40
>> [ 7.841663] [<c00581fc>] (kthread) from [<c000e9d8>] (ret_from_fork+0x14/0x3c)
>> [ 7.848911] r7:00000000 r6:00000000 r5:c00581fc r4:ecc0b300
>> [ 7.854618] handlers:
>> [ 7.856905] [<c001e8c0>] dma_ccerr_handler
>> [ 7.861020] Disabling IRQ #46
>>
>> Acked-by: Peter Ujfalusi <peter.ujfalusi@xxxxxx>
>> Signed-off-by: Roger Quadros <rogerq@xxxxxx>
>> ---
>> arch/arm/common/edma.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
>> index 873dbfc..356281d 100644
>> --- a/arch/arm/common/edma.c
>> +++ b/arch/arm/common/edma.c
>> @@ -435,8 +435,11 @@ static irqreturn_t dma_ccerr_handler(int irq, void *data)
>> if ((edma_read_array(ctlr, EDMA_EMR, 0) == 0) &&
>> (edma_read_array(ctlr, EDMA_EMR, 1) == 0) &&
>> (edma_read(ctlr, EDMA_QEMR) == 0) &&
>> - (edma_read(ctlr, EDMA_CCERR) == 0))
>> + (edma_read(ctlr, EDMA_CCERR) == 0)) {
>> + dev_err(data, "%s: unmanaged event occured\n", __func__);
>> + edma_write(ctlr, EDMA_EEVAL, 1);
>
> Instead of writes to EDMA_EEVAL in multiple places, can you implement a
> goto based error recovery path? I think that will be easier to parse.
>
OK.
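
A rough sketch of the shape a goto-based recovery path could take,
written as a userspace mock so it can be compiled on its own. Nothing
here is the real edma.c code: the mock_* names, the register enum and
the way the error bits are cleared are placeholders, and returning
"handled" for the unknown-event case is an assumption that this thread
does not settle.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the registers the handler looks at. */
enum mock_reg {
	MOCK_EMR0, MOCK_EMR1, MOCK_QEMR, MOCK_CCERR, MOCK_EEVAL, MOCK_NREGS
};

static uint32_t mock_regs[MOCK_NREGS];
static unsigned int mock_eeval_writes; /* how often re-evaluation was requested */

static uint32_t mock_read(enum mock_reg r)
{
	return mock_regs[r];
}

static void mock_write(enum mock_reg r, uint32_t val)
{
	if (r == MOCK_EEVAL) {
		mock_eeval_writes++; /* only count the ack, nothing is stored */
		return;
	}
	mock_regs[r] = val;
}

/* Returns 1 as an analog of IRQ_HANDLED (assumption, see above). */
static int mock_ccerr_handler(void)
{
	if (mock_read(MOCK_EMR0) == 0 && mock_read(MOCK_EMR1) == 0 &&
	    mock_read(MOCK_QEMR) == 0 && mock_read(MOCK_CCERR) == 0) {
		fprintf(stderr, "%s: unmanaged event occurred\n", __func__);
		goto out;
	}

	/*
	 * Placeholder for the real per-bit error processing: the mock
	 * simply clears whatever was pending.
	 */
	mock_write(MOCK_EMR0, 0);
	mock_write(MOCK_EMR1, 0);
	mock_write(MOCK_QEMR, 0);
	mock_write(MOCK_CCERR, 0);

out:
	/* Single exit point: the interrupt is re-evaluated in exactly one place. */
	mock_write(MOCK_EEVAL, 1);
	return 1;
}
```

With this layout both the unknown-event case and the normal error case
fall through to the same label, so the EEVAL write appears only once
in the function.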
cheers,
-roger