Re: [PATCH PTI v2 0/6] Clean up pgd handling and fix VSYSCALL and LDT

From: Kirill A. Shutemov
Date: Tue Dec 12 2017 - 11:01:35 EST


On Sun, Dec 10, 2017 at 10:54:38PM -0800, Andy Lutomirski wrote:
> On Sun, Dec 10, 2017 at 10:47 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > I'm getting reasonably happy with this. It still needs more testing,
> > but I want to get it out there.
> >
> > The main things that need testing are the 5-level case for the both
> > vsyscalls and the LDT. I'm getting a double-fault in ldt_gdt_64,
> > but I haven't checked whether it's a bug in this series, and it
> > kind of looks like it isn't. I'll figure it out in the morning.
> > The docs also want updating for the 5 level case.
> >
>
> The actual failure looks a lot like the ESPFIX stacks aren't mapped in
> the usermode tables, which reinforces my old belief that this bit of
> code is bogus:
>
> /*
> * Just copy the top-level PGD that is mapping the espfix area to
> * ensure it is mapped into the user page tables.
> *
> * For 5-level paging, the espfix pgd was populated when
> * pti_init() pre-populated all the pgd entries. The above
> * p4d_alloc() would never do anything and the p4d_populate() would
> * be done to a p4d already mapped in the userspace pgd.
> */
> #ifdef CONFIG_PAGE_TABLE_ISOLATION
> if (CONFIG_PGTABLE_LEVELS <= 4) {
> set_pgd(kernel_to_user_pgdp(pgd),
> __pgd(_KERNPG_TABLE | (p4d_pfn(*p4d) << PAGE_SHIFT)));
> }
> #endif
>
> Of course, the comment is even more wrong with this series applied,
> but I think it's been wrong all along.
>
> I'll fix it in the morning if no one beats me to it.
>
> (Hint: this can be tested in QEMU with -machine accel=tcg -cpu qemu64,+la57)

I thought something like patch below would help (for both paging modes). It
indeed made sigreturn_64 run a bit longer in 5-level paging mode, but it still
double faults in the end.

Maybe it's another bug. I don't know. I don't have time right now to look into.

BTW, sigreturn_64 reports FAILS in 4-level paging mode.

t a/arch/x86/kernel/espfix_64.c b/arch/x86/kernel/espfix_64.c
index 1c44e72ed1bc..0725c04b2d3f 100644
--- a/arch/x86/kernel/espfix_64.c
+++ b/arch/x86/kernel/espfix_64.c
@@ -129,20 +129,11 @@ void __init init_espfix_bsp(void)
p4d = p4d_alloc(&init_mm, pgd, ESPFIX_BASE_ADDR);
p4d_populate(&init_mm, p4d, espfix_pud_page);

- /*
- * Just copy the top-level PGD that is mapping the espfix area to
- * ensure it is mapped into the user page tables.
- *
- * For 5-level paging, the espfix pgd was populated when
- * pti_init() pre-populated all the pgd entries. The above
- * p4d_alloc() would never do anything and the p4d_populate() would
- * be done to a p4d already mapped in the userspace pgd.
- */
#ifdef CONFIG_PAGE_TABLE_ISOLATION
- if (CONFIG_PGTABLE_LEVELS <= 4) {
- set_pgd(kernel_to_user_pgdp(pgd),
- __pgd(_KERNPG_TABLE | (p4d_pfn(*p4d) << PAGE_SHIFT)));
- }
+ /* Install the espfix pud into user space page tables too */
+ pgd = kernel_to_user_pgdp(pgd);
+ p4d = p4d_alloc(&init_mm, pgd, ESPFIX_BASE_ADDR);
+ p4d_populate(&init_mm, p4d, espfix_pud_page);
#endif

/* Randomize the locations */
--
Kirill A. Shutemov