Re: [PATCH v5 00/27] x86_64: Improvements at compressed kernel stage

From: Andy Lutomirski
Date: Tue Mar 14 2023 - 19:21:31 EST




On Tue, Mar 14, 2023, at 2:23 PM, Andy Lutomirski wrote:
> On Tue, Mar 14, 2023, at 3:13 AM, Evgeniy Baskov wrote:
>>
>> Kernel is made to be more compatible with PE image specification [3],
>> allowing it to be successfully loaded by stricter PE loader
>> implementations like the one from [2]. There is at least one
>> known implementation that uses that loader in production [4].
>> There are also ongoing efforts to upstream these changes.
>
> Can you clarify

Sorry, lost part of a sentence. Can you clarify in what respect the loader is stricter?


Anyway, I did some research. I found:

https://github.com/rhboot/shim/pull/459/commits/99a8d19326f69665e0b86bcfa6a59d554f662fba

which gives a somewhat incoherent-sounding description in which setting EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT apparently enables allocating memory that isn't RWX. But this seems odd: EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT is a property of the EFI *program*, not the boot services implementation. And I'd be surprised if a flag on the application changes the behavior of boot services, but, OTOH, this is Microsoft.

And the PE 89 spec does say that EFI_IMAGE_DLLCHARACTERISTICS_NX_COMPAT means "Image is NX compatible" and that is the sole mention of NX in the document.

And *this* seems to be the actual issue:

https://github.com/rhboot/shim/pull/459/commits/825d99361b4aaa16144392dc6cea43e24c8472ae

I assume that MS required this change as a condition for signing, but what do I know? Anyway, the rules appear to be that the PE sections must not be both W and X at the same time. (For those who are familiar with the abomination known as ELF but not with the abomination known as PE, a "section" is a range in the file that gets mapped into memory. Like a PT_LOAD segment in ELF.)
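To make that rule concrete, here is a rough sketch of what such a check could look like. This is not taken from shim or from this series. The struct and function names are made up for illustration, and only the two flag values come from the PE/COFF spec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Section characteristic flags as defined by the PE/COFF specification. */
#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u

/*
 * Hypothetical, trimmed-down view of a section header: only the fields
 * this check needs. A real header has more fields and an 8-byte name
 * that is not necessarily NUL-terminated.
 */
struct pe_section {
	char     name[9];
	uint32_t characteristics;
};

/* True if the section asks to be mapped writable and executable at once. */
static bool section_is_wx(const struct pe_section *s)
{
	const uint32_t wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE;

	return (s->characteristics & wx) == wx;
}

/* Returns the index of the first W+X section, or -1 if there is none. */
static int first_wx_section(const struct pe_section *secs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (section_is_wx(&secs[i]))
			return (int)i;
	return -1;
}
```

A loader that wants to be strict would presumably refuse the image whenever first_wx_section() returns something other than -1.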

Now I don't know whether anything prevents us from doing something awful like mapping the EFI stuff RX and then immediately *re*mapping everything RWX. (Not that I'm seriously suggesting that.) And it's not immediately clear to me how the rest of this series fits in, what this has to do with the identity map, etc.

Anyway, I think the series needs to document what's going on, in the changelog and relevant comments. And if the demand-population of the identity map is a problem, then there should be a comment like (made up -- don't say this unless it's correct):

A sufficiently paranoid EFI implementation may enforce W^X when mapping memory through the boot services protocols. And creating identity mappings in the page fault handler needs to use the boot services protocols to do so because [fill this in] [or it would be a bit of an abomination to do an end run around them by modifying the page tables ourselves] [or whatever is actually happening]. While we *could* look at the actual fault type and create an R or RW or RX mapping as appropriate, it's better to figure out what needs to be mapped for real and to map it with the correct permissions before faulting.

But I still think we should keep the demand-faulting code as a fallback, even if it's hardcoded as RW, and just log the fault mode and address. We certainly shouldn't be *executing* code that wasn't identity mapped. Unless that code is boot services and we're creating the boot services mappings!
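For illustration only, here is roughly what "look at the fault type" and "log the fault mode and address" could mean in code. The two error-code bits are the architectural x86 page-fault bits. Everything else (the enum, the function names, the message format) is made up and is not what the series or the existing fault handler does.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 page-fault error code bits. */
#define PF_WRITE 0x02u	/* fault was caused by a write access */
#define PF_INSTR 0x10u	/* instruction fetch; only reported with NX or SMEP enabled */

/* Hypothetical permission choices for a demand-created identity mapping. */
enum ident_perm { PERM_R, PERM_RW, PERM_RX };

/* Pick the narrowest mapping that would satisfy the faulting access. */
static enum ident_perm perm_for_fault(uint64_t error_code)
{
	if (error_code & PF_INSTR)
		return PERM_RX;
	if (error_code & PF_WRITE)
		return PERM_RW;
	return PERM_R;
}

static const char *perm_name(enum ident_perm p)
{
	switch (p) {
	case PERM_RX: return "RX";
	case PERM_RW: return "RW";
	default:      return "R";
	}
}

/* Format a log line describing the fault mode and the faulting address. */
static int format_fault(char *buf, size_t len, uint64_t error_code,
			uint64_t address)
{
	return snprintf(buf, len, "fault: mode=%s addr=0x%llx",
			perm_name(perm_for_fault(error_code)),
			(unsigned long long)address);
}
```

A fallback handler that hardcodes RW would ignore the result of perm_for_fault() for the mapping itself and use it only for the log line.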

For that matter, how confident are we that there aren't crappy boot services implementations out there that require that we fix up page faults? After all, it's not like EFI implementations, especially early ones, are any good.

--Andy