Re: [LKP] [x86] 811565123a: BUG: kernel hang in early-boot stage, last printk: Probing EDD (edd=off to disable)... ok

From: Ye Xiaolong
Date: Sun Oct 16 2016 - 21:57:30 EST


On 10/14, Andi Kleen wrote:
>On Fri, Oct 14, 2016 at 12:56:00PM +0800, Ye Xiaolong wrote:
>> On 10/14, Ye Xiaolong wrote:
>> >On 10/13, Andi Kleen wrote:
>> >>Andi Kleen <andi@xxxxxxxxxxxxxx> writes:
>> >>
>> >>Any comments on this?
>> >>
>> >>I still cannot reproduce the failure unfortunately.
>> >>
>> >
>> >Btw, you can try the commands below to reproduce the error on your local
>> >host; they will download the necessary images and run QEMU:
>> >
>> > git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>> > cd lkp-tests
>> > bin/lkp qemu -k KERNEL job-script # job-script is attached in the original report email
>>
>> Results show this hang may be related to the gcc version: with gcc
>> 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3) the kernel boots without the early
>> hang, but a kernel built with gcc 6.2.0 20160901 (Debian 6.2.0-3)
>> gets stuck in the early boot stage:
>
>I built a mainline gcc 6.2; unfortunately, the hang still doesn't reproduce.
>The kernel with your config boots to root.
>
>My guess is that something is broken with paravirt ops on 32bit
>on that compiler.
>
>I created a new patch with the probing code moved to a separate
>function. Can you test if that works?
>

The kernel boots to root with this new version of the patch.

Tested-by: Xiaolong Ye <xiaolong.ye@xxxxxxxxx>


Thanks,
Xiaolong


>-Andi
>
>
>commit 65c92de6678f04ce14b237d7073e164b98d9a8be
>Author: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>Date: Wed May 4 06:07:44 2016 -0700
>
> x86: Report Intel platform_id in /proc/cpuinfo
>
> We have a need to distinguish systems based on their platform ID.
> For example this is useful to distinguish systems with L4 cache
> versus ones without.
>
> There is a 3-bit identifier (also called processor flags) in
> the IA32_PLATFORM_ID MSR that can give a more fine-grained
> identification of the CPU than just the model number/stepping.
>
> IA32_PLATFORM_ID is architectural.
>
> The MSR can also be accessed through /dev/cpu/*/msr, but that
> requires root and is awkward.
>
> The patch moves the reading of IA32_PLATFORM_ID from the
> (late) microcode driver code into the main Intel CPU initialization
> path and then also prints it in /proc/cpuinfo.
>
> v2: Handle 0 platform_id. Fix commit message.
> v3: Move some code to cpu/intel.c
> v4: Update description too.
> v5: Move msr probe code out of line to w/a potential gcc 6 bug
> Cc: hmh@xxxxxxxxxx
> Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
>
>diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
>index 63def9537a2d..c1313b3f3e59 100644
>--- a/arch/x86/include/asm/processor.h
>+++ b/arch/x86/include/asm/processor.h
>@@ -135,6 +135,8 @@ struct cpuinfo_x86 {
> 	/* Index into per_cpu list: */
> 	u16 cpu_index;
> 	u32 microcode;
>+	u32 platform_id;
>+	u8 has_platform_id;
> };
>
> #define X86_VENDOR_INTEL 0
>diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
>index fcd484d2bb03..7da7f008cee0 100644
>--- a/arch/x86/kernel/cpu/intel.c
>+++ b/arch/x86/kernel/cpu/intel.c
>@@ -61,6 +61,19 @@ void check_mpx_erratum(struct cpuinfo_x86 *c)
> }
> }
>
>+/* noinline to work around problem with gcc 6.2 */
>+static noinline void probe_platformid(struct cpuinfo_x86 *c)
>+{
>+	if ((c->x86_model >= 5) || (c->x86 > 6)) {
>+		unsigned val[2];
>+
>+		/* get processor flags from MSR 0x17 */
>+		rdmsr(MSR_IA32_PLATFORM_ID, val[0], val[1]);
>+		c->platform_id = (val[1] >> 18) & 7;
>+		c->has_platform_id = true;
>+	}
>+}
>+
> static void early_init_intel(struct cpuinfo_x86 *c)
> {
> 	u64 misc_enable;
>@@ -211,6 +224,8 @@ static void early_init_intel(struct cpuinfo_x86 *c)
> }
>
> 	check_mpx_erratum(c);
>+
>+	probe_platformid(c);
> }
>
> #ifdef CONFIG_X86_32
>diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
>index cdc0deab00c9..fab07e49192e 100644
>--- a/arch/x86/kernel/cpu/microcode/intel.c
>+++ b/arch/x86/kernel/cpu/microcode/intel.c
>@@ -855,17 +855,13 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
> {
> 	static struct cpu_signature prev;
> 	struct cpuinfo_x86 *c = &cpu_data(cpu_num);
>-	unsigned int val[2];
>
> 	memset(csig, 0, sizeof(*csig));
>
> 	csig->sig = cpuid_eax(0x00000001);
>
>-	if ((c->x86_model >= 5) || (c->x86 > 6)) {
>-		/* get processor flags from MSR 0x17 */
>-		rdmsr(MSR_IA32_PLATFORM_ID, val[0], val[1]);
>-		csig->pf = 1 << ((val[1] >> 18) & 7);
>-	}
>+	if (c->has_platform_id)
>+		csig->pf = 1 << c->platform_id;
>
> 	csig->rev = c->microcode;
>
>diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
>index 18ca99f2798b..5345d50ed709 100644
>--- a/arch/x86/kernel/cpu/proc.c
>+++ b/arch/x86/kernel/cpu/proc.c
>@@ -76,6 +76,8 @@ static int show_cpuinfo(struct seq_file *m, void *v)
> 		seq_puts(m, "stepping\t: unknown\n");
> 	if (c->microcode)
> 		seq_printf(m, "microcode\t: 0x%x\n", c->microcode);
>+	if (c->has_platform_id)
>+		seq_printf(m, "platform_id\t: %d\n", c->platform_id);
>
> 	if (cpu_has(c, X86_FEATURE_TSC)) {
> 		unsigned int freq = cpufreq_quick_get(cpu);
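
As an aside for anyone cross-checking the reported value by hand: the
commit message above describes the platform ID as a 3-bit field of
IA32_PLATFORM_ID, and the patch extracts it as (val[1] >> 18) & 7 from
the high half of MSR 0x17. Below is a minimal userspace sketch of that
arithmetic, assuming the raw 64-bit MSR value has already been obtained
(for example through /dev/cpu/*/msr as root). The function names
platform_id_from_msr and pf_mask_from_platform_id are made up for
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch.
 * The patch reads the MSR as two 32-bit halves and computes
 * (high >> 18) & 7. On the full 64-bit value that selects
 * bits 52:50, since 32 + 18 = 50.
 */
static unsigned int platform_id_from_msr(uint64_t msr_value)
{
	uint32_t high = (uint32_t)(msr_value >> 32);

	return (high >> 18) & 7;
}

/*
 * Hypothetical helper mirroring "csig->pf = 1 << c->platform_id"
 * in the microcode hunk: a one-hot mask derived from the 3-bit ID.
 */
static unsigned int pf_mask_from_platform_id(unsigned int platform_id)
{
	assert(platform_id < 8);	/* the field is only 3 bits wide */
	return 1u << platform_id;
}
```

For a raw value whose bits 52:50 encode 5, platform_id_from_msr()
returns 5, and the matching pf mask is 1 << 5 = 0x20.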