Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

From: Andi Kleen
Date: Tue May 02 2006 - 11:45:16 EST


On Tuesday 02 May 2006 17:17, Martin J. Bligh wrote:
> Andi Kleen wrote:
> > On Tuesday 02 May 2006 16:02, Martin J. Bligh wrote:
> >
> >>>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> >>>anywhere but some Summit systems (at least every time I tried it it blew up
> >>>on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
> >>>CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)
> >>
> >>Bollocks. It works fine,
> >
> > On what kind of box? Some summit system, right?
>
> Summit and NUMA-Q, ie everything we originally created it for.

That's my point - it usually crashes everywhere else.

>
> > I think I stand by my original statement.
>
> If it works fine on some platforms and not on others, I would venture to
> suggest it's a platform-specific issue, and marking the whole thing as
> CONFIG_BROKEN would be an entirely inappropriate overreaction to what
> is probably a simple bug.

The problem is that it's not regression tested and quite complex and tends
to break often and stay broken.

If you don't want to mark it CONFIG_BROKEN then i would suggest a panic
early when the system isn't SUMMIT (I think NUMAQ does this already)

Something like the appended patch

-Andi

i386: Panic the system early when a NUMA kernel doesn't run on IBM NUMA

It has been broken forever anywhere else and is not too useful
anyways so best to disable it.

Signed-off-by: Andi Kleen <ak@xxxxxxx>

Index: linux/arch/i386/kernel/srat.c
===================================================================
--- linux.orig/arch/i386/kernel/srat.c
+++ linux/arch/i386/kernel/srat.c
@@ -327,6 +327,14 @@ int __init get_memcfg_from_srat(void)
int tables = 0;
int i = 0;

+ extern int use_cyclone;
+ if (use_cyclone == 0) {
+ /* Make sure user sees something */
+ static const char s[] __initdata = "Not an IBM x440/NUMAQ. Don't use i386 CONFIG_NUMA anywhere else."
+ early_printk(s);
+ panic(s);
+ }
+
if (ACPI_FAILURE(acpi_find_root_pointer(ACPI_PHYSICAL_ADDRESSING,
rsdp_address))) {
printk("%s: System description tables not found\n",
Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -517,6 +517,9 @@ config NUMA
depends on SMP && HIGHMEM64G && (X86_NUMAQ || X86_GENERICARCH || (X86_SUMMIT && ACPI))
default n if X86_PC
default y if (X86_NUMAQ || X86_SUMMIT)
+ help
+ NUMA support. Note this only works on IBM x440 or IBM NUMAQ.
+ Don't try to use it anywhere else.

comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
depends on X86_SUMMIT && (!HIGHMEM64G || !ACPI)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/