Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

From: Ingo Molnar
Date: Sat Jan 20 2018 - 07:34:04 EST



* Laura Abbott <labbott@xxxxxxxxxx> wrote:

> The machines I have are a Lenovo X1 Carbon and a Lenovo T470s.
> A Lenovo X220 ThinkPad also reported the problem.
>
> If I comment out sme_encrypt_kernel it boots:
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 7ba5d819ebe3..443ef5d3f1fa 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -158,7 +158,7 @@ unsigned long __head __startup_64(unsigned long
> physaddr,
> *p += load_delta - sme_get_me_mask();
>
> /* Encrypt the kernel and related (if SME is active) */
> - sme_encrypt_kernel(bp);
> + //sme_encrypt_kernel(bp);
>
> /*
> * Return the SME encryption mask (if SME is active) to be used as a
>
>
> Interestingly, I tried to print the values in sme_active
> (sme_me_mask , sev_enabled) followed by a return at the
> very start of sme_encrypt_kernel and that rebooted as well,
> vs booting if I just kept the return. sme_me_mask and
> sev_enabled are explicitly marked as being in .data,
> is it possible they are ending up in a section that isn't
> yet mapped or did I hit print too early?

So all this is in awfully early code.

I think you should only be able to use early_printk here - is that what you are
using?

As like Linus I don't see anything explicitly wrong in the patch, it obviously
made a difference to you and others, and the commenting out experiment verifies
the bisection result I think.

Here's a brute-force list of historic problems in early code, and an attempt to
check whether those aspects are fine:

1) stack troubles

The bisected-to patch adds one more C function call parameter, and one of the (low
probability) possibilities would be for the initial stack to be overflowing.

But stack setup in setup_64() looks fine to me:

/* Set up the stack for verify_cpu(), similar to initial_stack below */
leaq (__end_init_task - SIZEOF_PTREGS)(%rip), %rsp

/* Sanitize CPU configuration */
call verify_cpu


__end_init_task is defined as:

#define INIT_TASK_DATA(align) \
. = ALIGN(align); \
VMLINUX_SYMBOL(__start_init_task) = .; \
*(.data..init_task) \
VMLINUX_SYMBOL(__end_init_task) = .;


and we set up space for the init task in arch/x86/kernel/vmlinux.lds.S via:

/* Data */
.data : AT(ADDR(.data) - LOAD_OFFSET) {
/* Start of data section */
_sdata = .;

/* init_task */
INIT_TASK_DATA(THREAD_SIZE)

where THREAD_SIZE is at least 16K of space, more on KASAN.

So we put the initial stack PT_REGS below the end of &init_task - which should all
be good and there should be plenty of space.

2)

using global variables, which is unsafe in early code if the kernel is
relocatable.

The bisected to commit uses a new sme_populate_pgd_data to collect variables that
were already on the stack, which should be position independent and safe.

But the other commits use sme_active(), which does:

bool sme_active(void)
{
return sme_me_mask && !sev_enabled;
}
EXPORT_SYMBOL(sme_active);

And that looks PIC-unsafe to me, as both are globals:

u64 sme_me_mask __section(.data) = 0;
EXPORT_SYMBOL(sme_me_mask);

Does the code start working if you force sme_active() to 0 while keeping the
function call, i.e. something like the hack below?

Thanks,

Ingo

arch/x86/mm/mem_encrypt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 3ef362f598e3..52f7db4d08d6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -403,7 +403,7 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
*/
bool sme_active(void)
{
- return sme_me_mask && !sev_enabled;
+ return 0;
}
EXPORT_SYMBOL(sme_active);