Re: [PATCH 1/2] x86/efi: Map EFI memmap entries in-order at runtime

From: James Bottomley
Date: Wed Sep 30 2015 - 13:26:25 EST


On Wed, 2015-09-30 at 09:43 -0700, Andy Lutomirski wrote:
> On Wed, Sep 30, 2015 at 2:30 AM, Ard Biesheuvel
> <ard.biesheuvel@xxxxxxxxxx> wrote:
> > On 29 September 2015 at 23:58, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
> >> On 09/28/15 08:41, Matthew Garrett wrote:
> >>> On Mon, Sep 28, 2015 at 08:16:46AM +0200, Ingo Molnar wrote:
> >>>
> >>>> So the question is, what does Windows do?
> >>>
> >>> It's pretty trivial to hack OVMF to dump the SetVirtualAddressMap()
> >>> arguments to the qemu debug port. Unfortunately I'm about to drop
> >>> mostly offline for a week, otherwise I'd give it a go...
> > [...]
> >> Then I booted my Windows Server 2012 R2, Windows 8.1, and Windows 10
> >> guests, with the properties table feature enabled vs. disabled in the
> >> firmware. (All three Windows guests were updated first though.)
> >>
> >> All three Windows OSes adapt their SetVirtualAddressMap() calls, when
> >> the feature is enabled in the firmware. However, Windows 8.1 crashes
> >> nonetheless (BSOD, I forget the fault details, sorry). Windows Server
> >> 2012 R2 and Windows 10 boot fine.
> >>
> >
> > Looking at the log, it seems the VA mapping strategy is actually the
> > same (i.e., bottom-up for Win10), and the difference can be explained
> > by the differences in the memory map provided by the firmware to the
> > OS. And indeed, the Win8.1 log shows the following:
> >
> > # MemType Phys 0x Virt 0x Size 0x Attributes
> > -- ------- -------- -------- ------- -------------------------------
> > 0 RtData 7EC21000 FFBFA000 0006000 [UC|WC|WT|WB| |XP| | | |RT]
> > 1 RtCode 7EC27000 FFBF3000 0007000 [UC|WC|WT|WB| | |RO| | |RT]
> > 2 RtData 7EC2E000 FFBEC000 0007000 [UC|WC|WT|WB| |XP| | | |RT]
> > 3 RtData 7EC35000 FFBEB000 0001000 [UC|WC|WT|WB| |XP| | | |RT]
> > 4 RtCode 7EC36000 FFBE6000 0005000 [UC|WC|WT|WB| | |RO| | |RT]
> > 5 RtData 7EC3B000 FFBE4000 0002000 [UC|WC|WT|WB| |XP| | | |RT]
> > 6 RtData 7EC60000 FFBDE000 0006000 [UC|WC|WT|WB| |XP| | | |RT]
> > 7 RtCode 7EC66000 FFBD5000 0009000 [UC|WC|WT|WB| | |RO| | |RT]
> > 8 RtData 7EC6F000 FFBD3000 0002000 [UC|WC|WT|WB| |XP| | | |RT]
> > 9 RtData 7EC9E000 FFAFA000 00D9000 [UC|WC|WT|WB| |XP| | | |RT]
> > 10 RtCode 7ED77000 FFA63000 0097000 [UC|WC|WT|WB| | |RO| | |RT]
> > 11 RtData 7EE0E000 FFA58000 000B000 [UC|WC|WT|WB| |XP| | | |RT]
> > 12 RtData 7FE99000 FFA52000 0006000 [UC|WC|WT|WB| |XP| | | |RT]
> > 13 RtCode 7FE9F000 FFA4C000 0006000 [UC|WC|WT|WB| | |RO| | |RT]
> > 14 RtData 7FEA5000 FFA49000 0003000 [UC|WC|WT|WB| |XP| | | |RT]
> > 15 RtCode 7FEA8000 FFA42000 0007000 [UC|WC|WT|WB| | |RO| | |RT]
> > 16 RtData 7FEAF000 FFA3F000 0003000 [UC|WC|WT|WB| |XP| | | |RT]
> > 17 RtCode 7FEB2000 FFA36000 0009000 [UC|WC|WT|WB| | |RO| | |RT]
> > 18 RtData 7FEBB000 FFA33000 0003000 [UC|WC|WT|WB| |XP| | | |RT]
> > 19 RtCode 7FEBE000 FFA2A000 0009000 [UC|WC|WT|WB| | |RO| | |RT]
> > 20 RtData 7FEC7000 FFA04000 0026000 [UC|WC|WT|WB| |XP| | | |RT]
> > 21 RtData 7FFD0000 FF9E4000 0020000 [UC|WC|WT|WB| |XP| | | |RT]
> > 22 RtData FFE00000 FF7E4000 0200000 [UC| | | | |XP| | | |RT]
> >
> > I.e., the physical addresses increase while the virtual addresses
> > decrease, and since each consecutive RuntimeCode/RuntimeData pair
> > constitutes a PE/COFF image (.text and .data, respectively), the
> > PE/COFF images appear corrupted in the virtual space.
>
> All of this garbage makes me want to ask a rhetorical question:
>
> Why on Earth did anyone think it's a good idea to invoke EFI functions
> at CPL0 once the OS is booted?

I'm afraid the originators of EFI (Intel) look on it as a DOS
replacement ... with the same OS support.

> And a more practical question:
>
> Do we actually have to invoke EFI functions at CPL0?
>
> I really mean it. Sure, for things like reboot where we give up
> control and don't get it back, we need to do that. But for things
> like variable access, the EFI code should really only need access to
> EFI memor (with a known PA -> VA map) and the ability to trigger an
> SMI. Doing it at CPL3 could require more fixups than would really
> make sense, but could we virtualize it instead?
>
> Actually, CPL3 + IOPL3 just might work.
>
> Heck, on mixed-mode, we're already invoke EFI functions in compat
> mode, and that seems okay, so those functions can't be poking at any
> CPU state that varies between long and 32-bit modes.

It's hard. The EFI functions expect to interact directly with kernel
memory, which they can't at CPL3. We could vector all that through a
CPL3 readable buffer but anything within EFI that uses privileged
instructions will fault and we'll have to handle it ... this really
sounds like a can of worms. Especially as windows will be no help
testing all of this because it will call in at CPL0.

James

N‹§²æ¸›yú²X¬¶ÇvØ–)Þ{.nlj·¥Š{±‘êX§¶›¡Ü}©ž²ÆzÚj:+v‰¨¾«‘êZ+€Êzf£¢·hšˆ§~†­†Ûÿû®w¥¢¸?™¨è&¢)ßf”ùy§m…á«a¶Úÿ 0¶ìå