Re: [PATCH v5 1/2] x86_64: remove not needed clear_page for init_level4_page

From: Borislav Petkov
Date: Tue Jun 16 2015 - 07:16:53 EST


On Tue, Jun 09, 2015 at 12:42:00PM +0300, Alexander Popov wrote:
> From: Andrey Ryabinin <a.ryabinin@xxxxxxxxxxx>
>
> Commit 8170e6bed465 ("x86, 64bit: Use a #PF handler to materialize
> early mappings on demand") introduced clear_page(init_level4_pgt);
> call in x86_64_start_kernel(). However, this clear_page is useless
> because init_level4_page already filled with zeroes in head_64.S

I really doubt that, see below:

> Commit message in 8170e6bed465 says that this clear_page() was
> dropped in v7, but it accidentally reappeared in later versions
> of that patchset.
>
> Signed-off-by: Andrey Ryabinin <a.ryabinin@xxxxxxxxxxx>
> ---
> arch/x86/kernel/head64.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 5a46681..6a6eefd 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -177,7 +177,6 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
> */
> load_ucode_bsp();
>
> - clear_page(init_level4_pgt);
> /* set init_level4_pgt kernel high mapping*/
> init_level4_pgt[511] = early_level4_pgt[511];

vmlinux has:

ffffffff81a00000 <init_level4_pgt>:
ffffffff81a00000: 63 10 movslq (%rax),%edx
ffffffff81a00002: a0 01 00 00 00 00 00 movabs 0x1,%al
ffffffff81a00009: 00 00
...
ffffffff81a0087f: 00 63 10 add %ah,0x10(%rbx)
ffffffff81a00882: a0 01 00 00 00 00 00 movabs 0x1,%al
ffffffff81a00889: 00 00
...
ffffffff81a00ff7: 00 67 30 add %ah,0x30(%rdi)
ffffffff81a00ffa: a0 01 00 00 00 00 63 movabs 0xa020630000000001,%al
ffffffff81a01001: 20 a0

ffffffff81a01000 <level3_ident_pgt>:
ffffffff81a01000: 63 20 movslq (%rax),%esp
ffffffff81a01002: a0 01 00 00 00 00 00 movabs 0x1,%al
ffffffff81a01009: 00 00
...

Booting a kvm quest and halting it before the clear_page:

---
/*
* Load microcode early on BSP.
*/
load_ucode_bsp();

// clear_page(init_level4_pgt);

while(1)
cpu_relax();

/* set init_level4_pgt kernel high mapping*/
init_level4_pgt[511] = early_level4_pgt[511];
---

and dumping the memory at ffffffff81a00000 gives:

---
00000000 63 10 a0 01 00 00 00 00 00 00 00 00 00 00 00 00 |c...............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000880 63 10 a0 01 00 00 00 00 00 00 00 00 00 00 00 00 |c...............|
00000890 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000ff0 00 00 00 00 00 00 00 00 67 30 a0 01 00 00 00 00 |........g0......|
00001000
---

These are basically the PTE entries it gets for the CONFIG_XEN case
which we clear because we're on baremetal.

Now my hunch is that those entries get overwritten later but I wouldn't
want to debug any strange bugs from leftovers in init_level4_pgt so
having the clear_page() is actually a good thing.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/