Re: [GIT PULL] x86/mm changes for v4.14: PCID support, 5-level paging support, Secure Memory Encryption support

From: Andy Lutomirski
Date: Wed Sep 06 2017 - 18:29:38 EST


On Wed, Sep 6, 2017 at 2:16 PM, Jiri Kosina <jikos@xxxxxxxxxx> wrote:
> On Wed, 6 Sep 2017, Jiri Kosina wrote:
>
>> This is a "me too", observed on my Lenovo thinkpad x270 (so it's not
>> specific to that XPS 13 system at all).
>>
>> The symptom I observe is that an attempt to resume from hibernation
>> proceeds up to reading 100% of the hibernation image, and then reboot
>> happens (IOW looks like triple fault).
>>
>> nopcid cures it, I haven't tried to revert 10af6235e0d3 yet, but looks
>> like it's the same thing.
>
> [ reposting the information again with LKML re-introduced to CC ]
>
> As suggested by Andy off-list, I tested with this change to always force
> ASID 0
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 5ca71d1..c3b0811 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -35,7 +35,7 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
> {
> u16 asid;
>
> - if (!static_cpu_has(X86_FEATURE_PCID)) {
> + if (true || !static_cpu_has(X86_FEATURE_PCID)) {
> *new_asid = 0;
> *need_flush = true;
> return;
>
> and that fixes the issue on my system.


I got Linus' config to boot. The problem was that I ended up with a
root-owned file (not sure which) in my tree that caused an incorrect
build but didn't generate errors. I don't know how this happened, but
an ill-timed sudo make -j4 modules_install install was probably
involved. git clean -ffxxxd did *not* fix it or even notice it in
any obvious way.

Anyway, the problem appears to depend on kernel config because it's
dying here on resume on secondary cpus:

VM_BUG_ON(__read_cr3() != (__sme_pa(real_prev->pgd) | prev_asid));

in switch_mm_irqs_off().
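
For anyone not staring at this code daily, here is a rough userspace
sketch of what that assertion compares. With CR4.PCIDE set, the low 12
bits of CR3 carry the PCID and the upper bits carry the page-table root.
The helper names expected_cr3() and cr3_matches() are made up for
illustration and are not kernel functions, and the SME encryption mask
that __sme_pa() adds is left out:

```c
#include <assert.h>
#include <stdint.h>

/* CR3 bits 11:0 hold the PCID when CR4.PCIDE is set. */
#define PCID_MASK 0xfffULL

/*
 * Hypothetical helper: the CR3 value the assertion expects for a given
 * page-table root and ASID. The ASID is used directly as the PCID here,
 * mirroring the "pgd | prev_asid" expression in the VM_BUG_ON above.
 */
uint64_t expected_cr3(uint64_t pgd_pa, uint16_t asid)
{
	assert((pgd_pa & PCID_MASK) == 0);	/* root must be page-aligned */
	assert(asid <= PCID_MASK);
	return pgd_pa | asid;
}

/* Hypothetical helper: nonzero if the live CR3 matches what is expected. */
int cr3_matches(uint64_t cr3, uint64_t pgd_pa, uint16_t asid)
{
	return cr3 == expected_cr3(pgd_pa, asid);
}
```

A CR3 that holds the right page-table root but zeroed PCID bits only
passes this check when the expected ASID is also 0.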

What seems to be going on is that the wakeup CPU is restoring its
original state exactly. All other CPUs are restoring swapper_pg_dir but
are failing to restore the PCID tag bits, which trips the assertion with
probability 5/6 per non-boot CPU. So, if you have that debug option set,
you die with probability 1 - (1/6)^(cpus - 1), which is pretty large.
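
To put numbers on that, a small sketch of the arithmetic. It assumes six
equally likely ASIDs per CPU, which is only inferred from the 1/6 figure
above, and the function name resume_failure_probability() is made up for
illustration:

```c
#include <assert.h>

/* Assumption: six equally likely ASIDs, inferred from the 1/6 above. */
#define NR_ASIDS 6

/*
 * Probability that at least one non-boot CPU trips the assertion on
 * resume, for a machine with 'cpus' CPUs in total (boot CPU included).
 */
double resume_failure_probability(int cpus)
{
	double all_survive = 1.0;
	int i;

	assert(cpus >= 1);

	/* A non-boot CPU survives only if its expected ASID happens to be 0. */
	for (i = 1; i < cpus; i++)
		all_survive /= NR_ASIDS;

	return 1.0 - all_survive;
}
```

With 2 CPUs this comes to 5/6, and with 4 CPUs it is 1 - 1/216, so
already above 99%.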

I'll come up with a clean fix this evening, I hope.