Re: [PATCH v3 11/12] x86, boot: add fields to support load bzImage and ramdisk high

From: H. Peter Anvin
Date: Sat Nov 24 2012 - 19:05:00 EST


On 11/24/2012 03:50 PM, Eric W. Biederman wrote:

It was conservative at the time the code was introduced and it most
definitely is not wrong. The code predates the verbiage in boot.txt.
Apparently no one bothered to see what /sbin/kexec was actually doing
when they documented the 32bit boot loader interface. I was under the
impression that it was actual practice that was documented, but in this
particular case something else was documented instead. Since /sbin/kexec
did not need any of the more recent features, we simply have not noticed
it until now.


The problem is that kexec and others didn't follow any protocol at all, but rather did something that happened to work... but which could trivially be shown to have no way of being forward compatible.

We could work around it with a sentinel hack... except you *also*
probably modify *some* fields and now we have a horrid mix of
initialized and uninitialized fields to sort out... and there really
isn't any sane way for the kernel to sort that out.

We have a huge problem on our hands now because of it.

So, given the mess we now have on our hands... any suggestions how to best solve
it? There is the option of simply declaring old kexec binaries broken; they
will then not work reliably with newer kernels, if they even work reliably now
-- it is hard to know for certain.

I believe all added variables between the last version of the boot
protocol /sbin/kexec knows about and the current time were added in the
initialized data section. Certainly we can check and that will tell us
how likely changes in arch/x86/boot/ have been regressions in the 32bit
entry point support.

As for solving this there is a simple solution. Add a second jump
right after the first jump. The variables after the second jump can
all be zero initialized.

It doesn't work for the variables *before* the initialized section, and that is actually where we have most problems... there really are only very few bytes left after the initialized section. The reason we can't do anything about the area before it is because that has to have stuff in it, like the EFI header, to work.

And if we really care about breaking other boot loaders we can take a
survey and actually look and see what they do. There really aren't that
many x86 boot loaders.

There are more than you think... a lot of them are hiding in grotty corners. However, they are minority users.

It sounds like we are leaning toward some form of the sentinel hack, which means we need an enumerated list of things that should *not* be zeroed if the sentinel is present.
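To make the idea concrete, here is a rough sketch of what such a sanitizing pass could look like. Everything in it is an assumption for illustration: `struct demo_params`, `keep_list` and `demo_sanitize()` are hypothetical names, the layout is a toy and not the real boot parameter layout, and the sentinel convention assumed here is "zero if the loader started from a zeroed block, nonzero if it copied the image blindly".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified parameter block; NOT the real layout. */
struct demo_params {
	uint8_t  legacy_area[16]; /* may hold leftover bytes from the image  */
	uint8_t  sentinel;        /* assumed: nonzero => loader copied blindly */
	uint8_t  pad[3];
	uint32_t cmdline_ptr;     /* fields a loader sets on purpose */
	uint32_t ramdisk_image;
	uint32_t ramdisk_size;
	uint32_t new_field;       /* added later, unknown to old loaders */
};

/* One entry of the enumerated "do not zero" list. */
struct keep_range {
	size_t off;
	size_t len;
};

#define KEEP(field) \
	{ offsetof(struct demo_params, field), \
	  sizeof(((struct demo_params *)0)->field) }

/* The enumerated list of fields that survive sanitizing (illustrative). */
static const struct keep_range keep_list[] = {
	KEEP(cmdline_ptr),
	KEEP(ramdisk_image),
	KEEP(ramdisk_size),
};

/*
 * If the sentinel is present (nonzero), zero everything except the
 * fields on keep_list. Returns 1 if the block was sanitized, 0 if it
 * was left untouched.
 */
int demo_sanitize(struct demo_params *p)
{
	struct demo_params clean;
	size_t i;

	if (p->sentinel == 0)
		return 0;

	memset(&clean, 0, sizeof(clean));
	for (i = 0; i < sizeof(keep_list) / sizeof(keep_list[0]); i++) {
		const struct keep_range *r = &keep_list[i];

		/* Every kept range must lie inside the structure. */
		assert(r->off + r->len <= sizeof(clean));
		memcpy((uint8_t *)&clean + r->off,
		       (const uint8_t *)p + r->off, r->len);
	}
	*p = clean;
	return 1;
}
```

The hard part is not the loop but deciding what goes on the list: any field missing from it is silently lost for loaders that trip the sentinel, and any field wrongly on it keeps whatever bytes happened to be in the file.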

The option of declaring the list frozen makes me a bit nervous, because it isn't clear that we don't already have fields that will be misinterpreted by the kernel if filled in from the file.

-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
