Re: [PATCH 03/42] x86, boot: Fix run_size calculation

From: Kees Cook
Date: Tue Jul 07 2015 - 18:15:53 EST


On Tue, Jul 7, 2015 at 1:19 PM, Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
> While looking at the boot code to add mem mapping for kasl
> with 64bit above 4G support, I found that e6023367d779 ("x86, kaslr: Prevent
> .bss from overlaping initrd") and later introduced way to get kernel run_size
> and pass it around. At first run_size calculation is via perl and then
> changed to shell scripts.
>
> At first, that calculation is not right in the shell scripts:
> it is using bss offset in the file plus bss/brk section size.
>
> run_size=$(( $offsetA + $sizeA + $sizeB ))
>
> Idx Name Size VMA LMA File off Algn
> ...
> 24 .bss 000a1000 ffffffff825e0000 00000000025e0000 019e0000 2**12
> ALLOC
> 25 .brk 00026000 ffffffff82681000 0000000002681000 019e0000 2**0
> ALLOC
>
> that run_size will be 27947008.

In hex, this is 0x1aa7000. What should it be?

> it has extra not needed size as
> 1. we have hole between the sections in file to get aligned in file.
> 2. start of text is from 0x200000 in elf file.
>
> [Nr] Name Type Address Offset
> Size EntSize Flags Link Info Align
> ...
> [25] .bss NOBITS ffffffff825e0000 019e0000
> 00000000000a1000 0000000000000000 WA 0 0 4096
> [26] .brk NOBITS ffffffff82681000 019e0000
> 0000000000026000 0000000000000000 WA 0 0 1
>
> Program Headers:
> Type Offset VirtAddr PhysAddr
> FileSiz MemSiz Flags Align
> LOAD 0x0000000000200000 0xffffffff81000000 0x0000000001000000
> 0x00000000013a9000 0x00000000013a9000 R E 200000
> LOAD 0x0000000001600000 0xffffffff82400000 0x0000000002400000
> 0x00000000000ed000 0x00000000000ed000 RW 200000
> LOAD 0x0000000001800000 0x0000000000000000 0x00000000024ed000
> 0x0000000000013698 0x0000000000013698 RW 200000
> LOAD 0x0000000001901000 0xffffffff82501000 0x0000000002501000
> 0x00000000000df000 0x00000000001a6000 RWE 200000
> NOTE 0x0000000000e9d7dc 0xffffffff81c9d7dc 0x0000000001c9d7dc
> 0x0000000000000024 0x0000000000000024 4
>
> Section to Segment mapping:
> Segment Sections...
> 00 .text .notes ..
> 01 .data .vvar
> 02 .data..percpu
> 03 .init.text ... .bss .brk
> 04 .notes
>
> During decompress_kernel, parse_elf will move forward section to run time position.
>
> parse_elf: [0x009a000000-0x009b3a8fff] <=== [0x009a200000-0x009b5a8fff]
> parse_elf: [0x009b400000-0x009b4ecfff] <=== [0x009b600000-0x009b6ecfff]
> parse_elf: [0x009b4ed000-0x009b500697] <=== [0x009b800000-0x009b813697]
> parse_elf: [0x009b501000-0x009b5dffff] <=== [0x009b901000-0x009b9dffff]

Where is this output from? The first LOAD is at 0x1000000, so I assume
the extra 0x99000000 came from kASLR as the base address?

Looking at Program Header loading, .brk ends at 0x009b9dffff (offset
from 0x009a200000) this is a run_size of 0x17e0000, not 0x1aa7000 as
calculated by the shell script?

I'm having a hard time seeing the holes, though. It looks like the 03
Segment starts at 0x2501000 with a file size of 0xdf000, which matches
the .bss LMA start: 0x25e0000. And the 03 Segment has a MemSiz of
0x1a6000, getting us from 0x2501000 to 0x26a7000, which exactly
matches the end of .brk, which is 0x2681000 + 0x26000 = 0x26a7000.
(And offset from the .text start of 0x1000000, means a run_size of
0x16a7000.)

The missing 0x2c7000 seems to be from various Sections being laid out
differently? Perhaps the shell script should be using LMA (or VMA),
not File offset? .brk LMA 0x2681000 - .text LMA 0x1000000 + .brk size
0x26000 = 0x16a7000, as above.

So, I'm convinced the shell script is calculating something wrong (it
gets 0x1aa7000, but it should be 0x16a7000, off by 0x400000), but I
don't see how this relates to the parse_elf output that suggests the
run_size should be 0x009b9dffff - 0x9a200000 = 0x17e0000. Is 0x16a7000
too low, and if so, why?

> Secondly it is not necessary. As run_size is simple constant, we don't
> need to pass it around and we already have voffset.h for that.
>
> We can share voffset.h between misc.c and header.S instead of adding
> other way to get run_size.

Can you demonstrate that "VO__end - VO__text" matches an expected
value? This isn't obvious to me. I would expect one of 0x1aa7000,
0x16a7000, or 0x17e0000 here.

Thanks! Regardless of my confusion, I would _love_ to get rid of the
external calculation of run_size. It's a really terrible hack, but I
hadn't been able to convince myself anything else was correct. I just
want to make sure we're getting the right value. :)

-Kees

>
> In this patch, we move voffset.h creation code to boot/compressed/Makefile.
>
> Dependence was:
> boot/header.S ==> boot/voffset.h ==> vmlinux
> boot/header.S ==> compressed/vmlinux ==> compressed/misc.c
> Now become:
> boot/header.S ==> compressed/vmlinux ==> compressed/misc.c ==> boot/voffset.h ==> vmlinux
>
> Use macro in misc.c to replace passed run_size.
>
> Fixes: e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
> Cc: Junjie Mao <eternal.n08@xxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> Cc: Matt Fleming <matt.fleming@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>
> ---
> arch/x86/boot/Makefile | 11 +----------
> arch/x86/boot/compressed/Makefile | 12 ++++++++++++
> arch/x86/boot/compressed/misc.c | 3 +++
> 3 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile
> index 57bbf2f..4d27e8b 100644
> --- a/arch/x86/boot/Makefile
> +++ b/arch/x86/boot/Makefile
> @@ -77,15 +77,6 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
>
> SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
>
> -sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 0x\1/p'
> -
> -quiet_cmd_voffset = VOFFSET $@
> - cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
> -
> -targets += voffset.h
> -$(obj)/voffset.h: vmlinux FORCE
> - $(call if_changed,voffset)
> -
> sed-zoffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|input_data\|_end\|z_.*\)$$/\#define ZO_\2 0x\1/p'
>
> quiet_cmd_zoffset = ZOFFSET $@
> @@ -97,7 +88,7 @@ $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE
>
>
> AFLAGS_header.o += -I$(obj)
> -$(obj)/header.o: $(obj)/voffset.h $(obj)/zoffset.h
> +$(obj)/header.o: $(obj)/zoffset.h
>
> LDFLAGS_setup.elf := -T
> $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 0a291cd..d9fee82 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -40,6 +40,18 @@ LDFLAGS_vmlinux := -T
> hostprogs-y := mkpiggy
> HOST_EXTRACFLAGS += -I$(srctree)/tools/include
>
> +sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p'
> +
> +quiet_cmd_voffset = VOFFSET $@
> + cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@
> +
> +targets += ../voffset.h
> +
> +$(obj)/../voffset.h: vmlinux FORCE
> + $(call if_changed,voffset)
> +
> +$(obj)/misc.o: $(obj)/../voffset.h
> +
> vmlinux-objs-y := $(obj)/vmlinux.lds $(obj)/head_$(BITS).o $(obj)/misc.o \
> $(obj)/string.o $(obj)/cmdline.o \
> $(obj)/piggy.o $(obj)/cpuflags.o
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index ebf72ce..a88b591 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -11,6 +11,7 @@
>
> #include "misc.h"
> #include "../string.h"
> +#include "../voffset.h"
>
> /* WARNING!!
> * This code is compiled with -fPIC and it is relocated dynamically
> @@ -393,6 +394,8 @@ asmlinkage __visible void *decompress_kernel(void *rmode, memptr heap,
> lines = real_mode->screen_info.orig_video_lines;
> cols = real_mode->screen_info.orig_video_cols;
>
> + run_size = VO__end - VO__text;
> +
> console_init();
> debug_putstr("early console in decompress_kernel\n");
>
> --
> 1.8.4.5
>



--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/