Re: [4.14-rc0 regression] Re: x60: warnings on boot and resume, arch/x86/mm/tlb.c:257 initialize_ ... was Re: [PATCH 0/2] Fix resume failure due to PCID

From: Andy Lutomirski
Date: Fri Sep 15 2017 - 17:06:55 EST


On Fri, Sep 15, 2017 at 12:29 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>
>> On Sep 15, 2017, at 11:47 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> On Fri, Sep 15, 2017 at 3:22 AM, Pavel Machek <pavel@xxxxxx> wrote:
>>>
>>> Let me pull latest...
>>>
>>> 711aab1dbb324d321e3d84368a435a78908c7bce
>>>
>>> (Strange. Not authored by Linus and old?)
>>
>> That's the author date, the committer date is new. Top of tree right
>> now just happens to be a patch I applied, it's much more commonly a
>> merge I've done.
>>
>>> But result is still similar, this time with more debug information.
>>> [ 0.116813] x86: Booting SMP configuration:
>>> [ 0.116893] .... node #0, CPUs: #1
>>> [ 0.004000] Initializing CPU#1
>>> [ 0.004000] ------------[ cut here ]------------
>>> [ 0.004000] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:257 initialize_tlbstate_and_flush+0x2e/0xed
>>> [ 0.004000] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
>>> [ 0.004000] task: f5ca2080 task.stack: f5cc4000
>>> [ 0.004000] EIP: initialize_tlbstate_and_flush+0x2e/0xed
>>> [ 0.004000] EFLAGS: 00210087 CPU: 1
>>> [ 0.004000] EAX: 0504b000 EBX: c4f15540 ECX: c4f15710 EDX: 00000000
>>> [ 0.004000] ESI: 04ee7000 EDI: f5ca2080 EBP: f5cc5f54 ESP: f5cc5f44
>>> [ 0.004000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>> [ 0.004000] CR0: 80050033 CR2: 00000000 CR3: 04ee7000 CR4: 000006b0
>>> [ 0.004000] Call Trace:
>>> [ 0.004000] cpu_init+0xdc/0x2f0
>>> [ 0.004000] start_secondary+0x34/0x1c6
>>> [ 0.004000] startup_32_smp+0x164/0x166
>>> [ 0.004000] ? startup_32_smp+0x164/0x166
>>> [ 0.004000] Code: 56 53 83 ec 08 64 8b 1d c0 c0 03 c5 b9 10 57 f1 c4 e8 de 65 9e 00 89 45 f0 89 55 f4 0f 20 de 8b 43 20 05 00 00 00 40 39 c6 74 11 <0f> ff 50 56 68 74 9c d8 c4 e8 d1 cd 04 00 83 c4 0c a1 20 90 f8
>>> [ 0.004000] ---[ end trace 7439e29925a49b51 ]---
>>> [ 0.004000] # CR3: 0000000004ee7000, __pa(mm->pgd): 000000000504b000
>>
>> Ok, clearly Andy didn't get the 32-bit SMP bringup path right.
>> Presumably tested a 32-bit UP image, or in a single-cpu VM?
>>
>> Andy?
>
> The warning only triggers on 32-bit SMP. The issue seems to be that boot_cpu_has(X86_FEATURE_PCID) is true despite setup_clear_cpu_cap. I'm still trying to figure out why. I'll hopefully have a patch after lunch.

Nah, that was a false alarm, although there's arguably a real bug here.
But I found the issue causing the warning.

x86's boot code is so inconsistent and weird that I'm a bit surprised
that it works at all sometimes.
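
For anyone following along, here is a minimal userspace sketch of the
comparison that the quoted warning reports: the value loaded in CR3
versus the physical address of the page directory the kernel expects.
It is not the code in arch/x86/mm/tlb.c. The names sketch_pa() and
cr3_mismatch() are hypothetical, and SKETCH_PAGE_OFFSET assumes the
default 32-bit 3G/1G split (0xC0000000), which appears consistent with
the "05 00 00 00 40" (add $0x40000000) in the quoted Code: bytes.

```c
#include <stdint.h>

/* Assumption: default 32-bit x86 3G/1G split; not stated in the thread. */
#define SKETCH_PAGE_OFFSET 0xC0000000u

/*
 * Hypothetical stand-in for __pa() on a lowmem address: subtract the
 * assumed PAGE_OFFSET to turn a kernel virtual address into a physical one.
 */
static uint32_t sketch_pa(uint32_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET;
}

/*
 * Returns 1 when the loaded CR3 value differs from the physical address
 * of the expected page directory, 0 when they match. Modeled on the
 * condition the quoted warning reports, not copied from the kernel.
 */
static int cr3_mismatch(uint32_t cr3, uint32_t pgd_vaddr)
{
	return cr3 != sketch_pa(pgd_vaddr);
}
```

With the values from Pavel's log, CR3 is 0x04ee7000 and __pa(mm->pgd)
is 0x0504b000. Under the assumed split the pgd would sit at virtual
address 0xc504b000, so cr3_mismatch(0x04ee7000, 0xc504b000) returns 1,
which is the case that warns.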

>
>>
>> Linus