Re: [patch V163 27/51] x86/mm/pti: Populate user PGD

From: Peter Zijlstra
Date: Mon Dec 18 2017 - 15:41:44 EST


On Mon, Dec 18, 2017 at 12:34:22PM -0800, Dave Hansen wrote:
> On 12/18/2017 03:42 AM, Thomas Gleixner wrote:
> > --- a/arch/x86/include/asm/pgtable.h
> > +++ b/arch/x86/include/asm/pgtable.h
> > @@ -1120,6 +1120,11 @@ static inline void pmdp_set_wrprotect(st
> >  static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
> >  {
> >  	memcpy(dst, src, count * sizeof(pgd_t));
> > +#ifdef CONFIG_PAGE_TABLE_ISOLATION
> > +	/* Clone the user space pgd as well */
> > +	memcpy(kernel_to_user_pgdp(dst), kernel_to_user_pgdp(src),
> > +	       count * sizeof(pgd_t));
> > +#endif
> >  }
>
> I was just thinking about this as I re-write the documentation about
> where the overhead of pti comes from.

The thing I thought of when I saw this earlier today was that this could
trivially be wrapped in a static_cpu_has(X86_FEATURE_PTI).
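
A minimal userspace sketch of that idea, not the actual patch. Everything
outside clone_pgd_range() is a stand-in: mock_pti_enabled plays the role of
static_cpu_has(X86_FEATURE_PTI), alloc_pgd_pair() is a hypothetical helper,
and kernel_to_user_pgdp() assumes the user PGD sits one page above the
kernel PGD in an 8 KiB-aligned pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE	4096u
#define PTRS_PER_PGD	512

typedef struct { uint64_t pgd; } pgd_t;

/* Stand-in for static_cpu_has(X86_FEATURE_PTI); a plain flag here. */
static bool mock_pti_enabled;

/*
 * Assumption for this sketch: kernel and user PGDs form an 8 KiB-aligned
 * pair, with the user copy in the second page.
 */
static pgd_t *kernel_to_user_pgdp(pgd_t *pgdp)
{
	return (pgd_t *)((uintptr_t)pgdp | PAGE_SIZE);
}

static void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
{
	memcpy(dst, src, count * sizeof(pgd_t));

	/* Skip the second copy entirely when PTI is not in use. */
	if (!mock_pti_enabled)
		return;

	/* Clone the user space pgd as well */
	memcpy(kernel_to_user_pgdp(dst), kernel_to_user_pgdp(src),
	       count * sizeof(pgd_t));
}

/* Hypothetical helper: allocate a zeroed, 8 KiB-aligned PGD pair. */
static pgd_t *alloc_pgd_pair(void)
{
	void *p = aligned_alloc(2 * PAGE_SIZE, 2 * PAGE_SIZE);

	assert(p != NULL);
	memset(p, 0, 2 * PAGE_SIZE);
	return p;
}
```

In the kernel the check would be patched at boot rather than tested at run
time, so the non-PTI path would only pay for the first memcpy().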

> This obviously *works* for now. But, we certainly have the pti-mapped
> stuff spread much less through the address space than when this was
> thrown in here. It *seems* like we could probably do this with just 4 PGDs:
>
> > pti_clone_user_shared();
> > pti_clone_entry_text();
> > pti_setup_espfix64();
> > pti_setup_vsyscall();
>
> The vsyscall is just one page and the espfix is *sized* to be one PGD,
> so we know each of those only takes one entry.
>
> We surely don't have 512GB of entry_text, and I don't think KASLR can
> ever cause it to span two PGD entries.
>
> I also don't think the user_shared area of the fixmap can get *that*
> big. Does anybody know offhand what the theoretical limits are there?

The problem there is the nr_cpus term, I think: we currently have up to 8k
CPUs, but I can see that getting bigger in the future.
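
To put rough numbers on that, here is a small sketch of the arithmetic,
assuming 4-level paging where one PGD entry covers 2^39 bytes (the 512GB
mentioned above). Both helper names are made up for illustration, and the
actual per-CPU size of the user_shared area is deliberately left as an
input rather than guessed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumes 4-level paging: one PGD entry maps 2^39 bytes (512 GiB). */
#define PGDIR_SHIFT	39
#define PGDIR_SIZE	(1ULL << PGDIR_SHIFT)

/*
 * Number of PGD entries touched by [start, start + size).
 * The caller must make sure start + size does not wrap around.
 */
static unsigned int pgd_entries_spanned(uint64_t start, uint64_t size)
{
	uint64_t last;

	assert(size > 0);
	last = start + size - 1;
	return (unsigned int)((last >> PGDIR_SHIFT) - (start >> PGDIR_SHIFT)) + 1;
}

/*
 * Largest per-CPU size for which nr_cpus contiguous per-CPU areas still
 * add up to no more than one PGD entry's worth of address space.
 */
static uint64_t max_per_cpu_bytes_for_one_pgd(uint64_t nr_cpus)
{
	assert(nr_cpus > 0);
	return PGDIR_SIZE / nr_cpus;
}
```

With 8k CPUs that budget works out to 64MB per CPU, and it halves each
time nr_cpus doubles. Note that even a small region spans two entries if
it straddles a PGD boundary, so alignment matters as much as total size.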