Re: [PATCH 0/3]: Discard reserved PXM bits for SRAT v1

From: xb
Date: Tue May 19 2009 - 05:02:07 EST


Hi Kurt,

We are hitting this problem on some Nehalem-based platforms, where it prevents correct NUMA initialization.
We proposed a patch on linux_acpi to fix it.
We are OK with your patch going into mainline so that this gets fixed quickly.
Thanks.
Xavier

Kurt Garloff wrote:
Hi,

The ACPI specification says that the OS must disregard reserved bits.
The x86_64 SRAT parser does not discard the upper 24 bits of the
proximity_domain (pxm) in the acpi_srat_mem_affinity entries for
SRAT v1 tables. (v2 has 32-bit-wide fields.)
This can lead to problems with poor BIOS implementations that fail
to set reserved bytes to zero. (The ACPI spec is a bit vague here,
unfortunately.)

This was also inconsistent: on x86-64 (srat_64.c), _cpu_affinity uses
only the low 8 bits of pxm, while the full 32 bits of _mem_affinity
are consumed.
In srat_32.c (x86), only 8 bits are used (which is OK; a 32-bit system
with >256 PXMs does not seem reasonable at all).
On ia64, the support of more than 8 bits was consistent between
mem and cpu affinity entries; however, it depended on the "sn2" platform.

The patch series has the following goals:
* Make the kernel consistently support 8 bits or 32 bits for the
proximity domain
* Make this dependent on the SRAT version; v1 => 8 bits, v2 => 32 bits.

Overview of the patches:
- [1/3] Store the SRAT table version value in acpi_srat_revision
- [2/3] x86-64: Discard the upper 24 bits in mem_affinity if rev <= 1
and use upper 24 bits in cpu_affinity if rev >= 2
- [3/3] ia64: Also use upper 8/24 bits if rev >= 2 (but leave logic to
enable on sn2 as well -- I don't know if sn2 reports v1 or v2
SRAT). Also add two __init decls in ia64 pxm accessors.

Patch has been tested on x86-64 against a 2.6.27.x kernel.
(Patch is against current git.)

Thanks to James, Greg, Alexey, and Norbert for comments, review and testing.

Please review and apply!

Greg, I believe this is a candidate for -stable.