RE: [PATCH] x86: skip migrating percpu irq in fixup_irqs

From: Tian, Kevin
Date: Thu May 05 2011 - 05:03:21 EST


> From: Ian Campbell [mailto:Ian.Campbell@xxxxxxxxxxxxx]
> Sent: Thursday, May 05, 2011 4:49 PM
>
> On Thu, 2011-05-05 at 09:15 +0100, Tian, Kevin wrote:
> > x86: skip migrating percpu irq in fixup_irqs
> >
> > IRQF_PER_CPU is used by Xen for a set of virtual interrupts binding to
> > a specific vcpu, but it's not recognized in fixup_irqs which simply
> > atempts to migrate it to other vcpus. In most cases this just works,
> > because Xen event channel chip silently fails the set_affinity ops for
> > irqs marked as IRQF_PER_CPU. But there's a special category (like used
> > for pv spinlock) also adds a IRQ_DISABLED flag, used as polled only
> > event channels which don't expect any instance injected. However
> > fixup_irqs absolutely masks/unmasks irqs which makes such special type
> > irq injected unexpectely while there's no handler for it.
> >
> > This error is triggered on some box when doing reboot. The fix is to
> > recognize IRQF_PER_CPU flag early before the affinity change, which
> > actually matches the rationale of IRQF_PER_CPU.
>
> Skipping affinity fixup for PER_CPU IRQs seems logical enough (so I suspect this
> patch is good in its own right) but if the real issue you are fixing is that
> IRQ_DISABLED IQRs are getting unmasked should that issue be addressed
> directly rather than relying on it being a side-effect of PER_CPU-ness?

actually this is one thing I'm not sure and would like to hear more suggestions.
imo affinity and percpu are same type of attributes, which describe the
constraints to host a given irq, while disabled/masked are another type of
attributes which describe whether to accept an irq instance (mask is for
short period). even without the IRQ_DISABLED issue, this patch is also
required though it doesn't expose observable failure. I didn't bake the 2nd
patch to avoid unmask for IRQ_DISABLED, because I want to know whether
this is desired, e.g. IRQ_DISABLED is set in the fly when there can be still
one instance pending?

Thanks
Kevin

>
> > Signed-off-by: Fengzhe Zhang <fengzhe.zhang@xxxxxxxxx>
> > Signed-off-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> > CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > CC: Ingo Molnar <mingo@xxxxxxxxxx>
> > CC: H. Peter Anvin <hpa@xxxxxxxxx>
> > CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> > CC: Jan Beulich <JBeulich@xxxxxxxxxx>
> >
> > diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index
> > 1cb0b9f..544efe2 100644
> > --- a/arch/x86/kernel/irq.c
> > +++ b/arch/x86/kernel/irq.c
> > @@ -249,7 +249,7 @@ void fixup_irqs(void)
> >
> > data = irq_desc_get_irq_data(desc);
> > affinity = data->affinity;
> > - if (!irq_has_action(irq) ||
> > + if (!irq_has_action(irq) || irqd_is_per_cpu(data) ||
> > cpumask_subset(affinity, cpu_online_mask)) {
> > raw_spin_unlock(&desc->lock);
> > continue;
>

èº{.nÇ+‰·Ÿ®‰­†+%ŠËlzwm…ébëæìr¸›zX§»®w¥Š{ayºÊÚë,j­¢f£¢·hš‹àz¹®w¥¢¸ ¢·¦j:+v‰¨ŠwèjØm¶Ÿÿ¾«‘êçzZ+ƒùšŽŠÝj"ú!¶iO•æ¬z·švØ^¶m§ÿðà nÆàþY&—