Re: need help on a DEADLOCK problem related to function try_one_irq()
From: Edward Donovan
Date: Thu Dec 06 2012 - 01:10:50 EST
Hello Thomas, Stoney, Warner -
Thanks for your patience. :) The patch from Thomas works here, too.
I can add that when I made the patch that became 52553ddffad76cc, I
didn't understand why we needed to run the handler again, in that
test:
(action->handler(irq, action->dev_id) == IRQ_HANDLED)
But without it, irqfixup and irqpoll weren't working. And I can't
pretend I have a clear idea of what all this code should like, anyway,
myself. I and others had bisected the breaking of irqpoll/fixup to
commit fa27271bc8d. I was testing the parts of that small patch, like
puzzle pieces. That was one the one that worked for me and everyone
else, until now. I thought we might have a discussion about it, but
that didn't happen. The first time I submitted it, I hadn't heard
back. Then another user bugged Linus about the regression. I piped
up that I had submitted a patch. Users tested it successfully, and it
went in.
So, let me try to confirm some things now, so I can learn as I go. To
spell out what's in the new patch:
The later line,
action = desc->action;
should have been this all along?
action = action->next;
And with that fixed, the other test can go. Which is good, now,
because it was creating locking problems for you.
Ok, thanks for bearing with me. I'll repeat the test here, some
more, but they've generally behaved consistently and probably will
keep showing the same.
I might as well add that I would happily ship one of these computers,
with a reliably spurious interrupt, to Thomas in Germany. if that
would help for testing.
Thanks all -
Ed
On Mon, Nov 26, 2012 at 3:09 AM, Wang, Warner <warner.wang@xxxxxx> wrote:
> Hi Thomas and Edward,
>
> This patch works fine for our problems, but I'm not sure if it works for the recent submit "genirq: fix regression in irqfixup, irqpoll" "52553ddffad76ccf192d4dd9ce88d5818f57f62a", which submitted by Edward Donovan.
>
> Edward can you help verify it on your environment?
>
>
> Thanks,
> -Warner
>
> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> Sent: 2012å11æ23æ PM 5:09
> To: Wang, Warner
> Cc: Wang, Song-Bo (Stoney)
> Subject: Re: need help on a DEADLOCK problem related to function try_one_irq()
>
> On Thu, 22 Nov 2012, Thomas Gleixner wrote:
>
>> Warner,
>>
>> On Thu, 22 Nov 2012, Wang, Warner wrote:
>>
>> please send such bug reports to the kernel mailinglist in the future.
>>
>> > We met a problem on some of our x86 server and after a few weeks
>> > trace-down effort, we believe the problem is in the file
>> > "linux/kernel/irq/spurious.c". We created a patch but we are not
>> > 100% sure if it is correct or complete. That is why we want to
>> > consult you.
>>
>> You spotted the problem right, but I'm not sure at the first glance,
>> whether this is correct. I need to go back into history and figure out
>> why we added that call in the first place. It looks fundamentally
>> wrong.
>>
>> Thanks for analyzing it. I'll keep you posted on my findings.
>
> Can you try the patch below ?
>
> Thanks,
>
> tglx
>
> ---
> kernel/irq/spurious.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> Index: tip/kernel/irq/spurious.c
> ===================================================================
> --- tip.orig/kernel/irq/spurious.c
> +++ tip/kernel/irq/spurious.c
> @@ -80,13 +80,11 @@ static int try_one_irq(int irq, struct i
>
> /*
> * All handlers must agree on IRQF_SHARED, so we test just the
> - * first. Check for action->next as well.
> + * first.
> */
> action = desc->action;
> if (!action || !(action->flags & IRQF_SHARED) ||
> - (action->flags & __IRQF_TIMER) ||
> - (action->handler(irq, action->dev_id) == IRQ_HANDLED) ||
> - !action->next)
> + (action->flags & __IRQF_TIMER))
> goto out;
>
> /* Already running on another processor */ @@ -104,7 +102,7 @@ static int try_one_irq(int irq, struct i
> do {
> if (handle_irq_event(desc) == IRQ_HANDLED)
> ret = IRQ_HANDLED;
> - action = desc->action;
> + action = action->next;
> } while ((desc->istate & IRQS_PENDING) && action);
> desc->istate &= ~IRQS_POLL_INPROGRESS;
> out:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/