Re: [PATCH 4.4 00/37] 4.4.110-stable review
From: Thomas Gleixner
Date: Thu Jan 11 2018 - 18:03:28 EST
On Thu, 11 Jan 2018, Thomas Gleixner wrote:
> On Thu, 11 Jan 2018, Thomas Gleixner wrote:
> > On Thu, 11 Jan 2018, Linus Torvalds wrote:
> >
> > > On Thu, Jan 11, 2018 at 12:37 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > > >
> > > > 67a9108ed431 ("x86/efi: Build our own page table structures")
> > > >
> > > > got rid of EFI depending on real_mode_header->trampoline_pgd
> > >
> > > So I think it only got rid of by default - the codepath is still
> > > there, the allocation is still there, it's just that it's not actually
> > > used unless somebody does that "efi=old_mmap" thing.
> >
> > Yes, the trampoline_pgd is still around, but I can't figure out how it
> > would be used after boot. Confused, digging more.
>
> So coming back to the same commit. From the changelog:
>
> This is caused by mapping EFI regions with RWX permissions.
> There isn't much we can do to restrict the permissions for these
> regions due to the way the firmware toolchains mix code and
> data, but we can at least isolate these mappings so that they do
> not appear in the regular kernel page tables.
>
> In commit d2f7cbe7b26a ("x86/efi: Runtime services virtual
> mapping") we started using 'trampoline_pgd' to map the EFI
> regions because there was an existing identity mapping there
> which we use during the SetVirtualAddressMap() call and for
> broken firmware that accesses those addresses.
>
> So this very commit gets rid of the (ab)use of trampoline_pgd and allocates
> efi_pgd, which we made use the proper size.
>
> trampoline_pgd is since then only used to get into long mode in
> realmode/rm/trampoline_64.S and for reboot in machine_real_restart().
>
> The runtime services stuff does not use it in kernel versions >= 4.6

But there is one very well hidden user for it after boot:

It's used for booting secondary CPUs from real mode.

So the secondary CPUs use the trampoline pgd for the transition to long
mode and then jump to secondary_startup_64, where CR3 is set to the real
kernel page tables.

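To make the sequence above concrete, here is a toy userspace model of the CR3
handover. It is a sketch only, not kernel code: struct sim_cpu, the enum
values and all sim_* helpers are made-up names for illustration, and the real
work is done in assembly in realmode/rm/trampoline_64.S and
secondary_startup_64.

```c
#include <assert.h>

/* Hypothetical labels for "which page table root is loaded in CR3". */
enum pgd_id { PGD_NONE, PGD_TRAMPOLINE, PGD_KERNEL };

/* Hypothetical labels for the CPU operating mode. */
enum cpu_mode { MODE_REAL, MODE_LONG };

/* Toy model of a secondary CPU; not a kernel structure. */
struct sim_cpu {
	enum cpu_mode mode;
	enum pgd_id cr3;
};

/* Step 1: the secondary CPU comes up in real mode with paging off. */
static void sim_ap_reset(struct sim_cpu *cpu)
{
	cpu->mode = MODE_REAL;
	cpu->cr3 = PGD_NONE;
}

/*
 * Step 2: the real mode trampoline loads the trampoline pgd into CR3
 * and switches to long mode.
 */
static void sim_trampoline_enter_long_mode(struct sim_cpu *cpu)
{
	assert(cpu->mode == MODE_REAL);
	cpu->cr3 = PGD_TRAMPOLINE;
	cpu->mode = MODE_LONG;
}

/*
 * Step 3: the 64-bit secondary startup code replaces the trampoline
 * pgd with the real kernel page tables.
 */
static void sim_secondary_startup_64(struct sim_cpu *cpu)
{
	assert(cpu->mode == MODE_LONG);
	assert(cpu->cr3 == PGD_TRAMPOLINE);
	cpu->cr3 = PGD_KERNEL;
}

/* Run the whole sequence and report which page tables end up live. */
static enum pgd_id sim_boot_secondary(void)
{
	struct sim_cpu cpu;

	sim_ap_reset(&cpu);
	sim_trampoline_enter_long_mode(&cpu);
	sim_secondary_startup_64(&cpu);
	return cpu.cr3;
}
```

The point of the model is that the trampoline pgd is live only in the window
between step 2 and step 3, which is why it is so easy to miss as a post-boot
user.
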
Thanks,
tglx