Re: 3.2.1 Unable to reset IRR messages on boot
From: Josh Boyer
Date: Mon Mar 12 2012 - 09:25:35 EST
On Wed, Feb 01, 2012 at 12:00:30AM -0800, Suresh Siddha wrote:
> On Tue, 2012-01-31 at 09:26 -0500, Josh Boyer wrote:
> > On Wed, Jan 25, 2012 at 06:15:35PM -0500, Josh Boyer wrote:
> > > On Wed, Jan 25, 2012 at 02:04:08PM -0800, Suresh Siddha wrote:
> > > > On Wed, 2012-01-25 at 08:49 -0500, Josh Boyer wrote:
> > > > > [ 0.000000] IOAPIC[1]: apic_id 2, version 255, address 0xfec28000, GSI 24-279
> > > >
> > > > This looks indeed like a bogus entry probably returning all 1's for
> > > > RTE's etc. Can you please send me a dmesg with "apic=verbose" boot
> > > > parameter?
> > >
> > > Here you go:
> > >
> > > https://bugzilla.redhat.com/attachment.cgi?id=557552
> >
> > Was this helpful at all? I've been watching lkml for a related patch
> > in case I was missed on CC but haven't seen anything as of yet.
>
> Yes, it was helpful. Something like the appended patch should ignore the
> bogus io-apic entry all together. As I can't test this, can you or the
> reporter give the appended patch a try and ack please?
Hi Suresh,
Apologies for the delay. The original reporter had to return the
machine he was using. We've since had another report where this
happened and your patch below does indeed fix the issue.
I'd suggest pushing this soon.
https://bugzilla.redhat.com/show_bug.cgi?id=801501
josh
>
> thanks,
> suresh
> ---
>
> From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> Subject: x86, ioapic: add register checks for bogus io-apic entries
>
> With the recent changes to clear_IO_APIC_pin() which tries to clear
> remoteIRR bit explicitly, some of the users started to see
> "Unable to reset IRR for apic .." messages.
>
> Close look shows that these are related to bogus IO-APIC entries which
> return's all 1's for their io-apic registers. And the above mentioned error
> messages are benign. But kernel should have ignored such io-apic's in the
> first place.
>
> Check if register 0, 1, 2 of the listed io-apic are all 1's and ignore
> such io-apic.
>
> Reported-by: Álvaro Castillo <midgoon@xxxxxxxxx>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> ---
> arch/x86/kernel/apic/io_apic.c | 26 ++++++++++++++++++++++++++
> 1 files changed, 26 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index fb07275..953e54d 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -3979,6 +3979,26 @@ static __init int bad_ioapic(unsigned long address)
> return 0;
> }
>
> +static __init int bad_ioapic_regs(int idx)
> +{
> + union IO_APIC_reg_00 reg_00;
> + union IO_APIC_reg_01 reg_01;
> + union IO_APIC_reg_02 reg_02;
> +
> + reg_00.raw = io_apic_read(idx, 0);
> + reg_01.raw = io_apic_read(idx, 1);
> + reg_02.raw = io_apic_read(idx, 2);
> +
> + if (reg_00.raw == -1 && reg_01.raw == -1 && reg_02.raw == -1) {
> + printk(KERN_WARNING
> + "I/O APIC 0x%x regs return all ones, skipping!\n",
> + mpc_ioapic_addr(idx));
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
> {
> int idx = 0;
> @@ -3995,6 +4015,12 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
> ioapics[idx].mp_config.apicaddr = address;
>
> set_fixmap_nocache(FIX_IO_APIC_BASE_0 + idx, address);
> +
> + if (bad_ioapic_regs(idx)) {
> + clear_fixmap(FIX_IO_APIC_BASE_0 + idx);
> + return;
> + }
> +
> ioapics[idx].mp_config.apicid = io_apic_unique_id(id);
> ioapics[idx].mp_config.apicver = io_apic_get_version(idx);
>
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/