Re: [Regression] "irqdomain: Don't set type when mapping an IRQ" breaks nexus7 gpio buttons
From: Jon Hunter
Date: Mon Aug 08 2016 - 05:38:53 EST
On 06/08/16 00:45, John Stultz wrote:
> On Mon, Aug 1, 2016 at 3:26 AM, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
>> Hi John,
>>
>> On 30/07/16 05:39, John Stultz wrote:
>>> Hey Jon,
>>> So after rebasing my nexus7 patch stack onto pre-4.8-rc1 tree, I
>>> noticed the power/volume buttons stopped working.
>>>
>>> I did a manual rebased bisection and chased it down to your commit
>>> 1e2a7d78499e ("irqdomain: Don't set type when mapping an IRQ").
>>>
>>> Reverting that patch makes things work again, so I wanted to see if
>>> there was any debugging info I could provide to try to help narrow
>>> down the problem here. (Sorry, I'd tinker myself with it some and try
>>> to debug the issue, but after burning my friday night on this, I'm
>>> eager to get away from the keyboard for the weekend).
>>
>> Before this commit bad IRQ type settings in device-tree were not getting
>> reported and so failures to set the IRQ type were going unnoticed. It's
>> most likely a bad IRQ type settings somewhere.
>>
>> As Thomas mentioned hopefully dmesg will shed a bit more light.
>>
>> Otherwise it can be worth looking at the ->irq_set_type() function for
>> the irqchips in the path of the interrupt requested to see if any are
>> failing. Looking at the nexus7 (assuming qcom variant), it looks like
>> there are 3 irqchips in the path (pm8921 --> apq8064-pinctrl --> gic).
>> The pm8xxx_irq_set_type() could return a failure when setting up the IRQ
>> type and could be worth checking. It does not look like the set_type for
>> the apq8064-pinctrl should ever fail (apart from calling BUG() which
>> would be obvious). The gic can also return a failure for setting the
>> type, but I did not see anything at first glance that looks incorrect in
>> the dts.
>>
>> If we can narrow down irqchip, then hopefully it will be clearer.
>
> The pm_8xxx_irq_set_type doesn't seem to be failing as far as I can see..
>
> Looking at the patch that seems to cause the trouble, I narrowed it
> down to just the following chunk:
>
> @@ -614,7 +615,11 @@ unsigned int irq_create_fwspec_mapping(struct
> irq_fwspec *fwspec)
> * it now and return the interrupt number.
> */
> if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
> - irq_set_irq_type(virq, type);
> + irq_data = irq_get_irq_data(virq);
> + if (!irq_data)
> + return 0;
> +
> + irqd_set_trigger_type(irq_data, type);
> return virq;
> }
>
> If I revert just that, it works again.
>
> I was worried we were hitting an early failure from !irq_data, but it
> seems there's some subtle difference between irqd_set_trigger_type and
> irq_set_type that makes the former break for me.
Thanks this is good info and at the same time odd.
I am guessing that it is failing above because the irq_data is not found
for the irq?
What is odd, is that the above sequence is only executed if a irq
mapping exists and so really, AFAICT this should not happen. Ie. the irq
descriptor should have been allocated for the mapping to exist. We
should probably warn if this happens.
Without reverting the above, can you add a print to show the
domain->name, hwirq and virq information if !irq_data? That will confirm
the domain for us.
Also it could be worth enabling all the debug prints in
kernel/irq/irqdomain.c to see what is happening on boot.
Cheers
Jon
--
nvpublic