Re: [PATCH] Fix bss mapping for the interpreter in binfmt_elf

From: Kees Cook
Date: Fri Jul 08 2016 - 17:10:55 EST


On Mon, Jul 4, 2016 at 1:01 PM, Hector Marco-Gisbert <hecmargi@xxxxxx> wrote:
>> On Wed, May 11, 2016 at 3:37 AM, Hector Marco-Gisbert <hecmargi@xxxxxx> wrote:
>>> While working on a new ASLR for userspace we detected an error in the
>>> interpret loader.
>>>
>>> The size of the bss section for some interpreters is not correctly
>>> calculated resulting in unnecessary calls to vm_brk() with enormous size
>>> values.
>>>
>>> The bug appears when loading some interpreters with a small bss size. Once
>>> the last loadable segment has been loaded, the bss section is zeroed up to
>>> the page boundary and the elf_bss variable is updated to this new page
>>> boundary. Because of this update (alignment), the last_bss could be less
>>> than elf_bss and the subtraction "last_bss - elf_bss" value could overflow.
>>>
>>> Although it is quite easy to check this error, it has not been manifested
>>> because some peculiarities of the bug. Following is a brief description:
>>
>> Heh, the bugs cancel each other out! :P
>>
>>>
>>>
>>> $ size /lib32/ld-2.19.so
>>> text data bss dec hex filename
>>> 128456 2964 192 131612 2021c /lib32/ld-2.19.so
>>>
>>>
>>> An execution example with:
>>> - last_bss: 0xb7794938
>>> - elf_bss: 0xb7794878
>>>
>>>
>>> From fs/binfmt_elf.c:
>>> ---------------------------------------------------------------------------
>>> if (last_bss > elf_bss) {
>>> /*
>>> * Now fill out the bss section. First pad the last page up
>>> * to the page boundary, and then perform a mmap to make sure
>>> * that there are zero-mapped pages up to and including the
>>> * last bss page.
>>> */
>>> if (padzero(elf_bss)) {
>>> error = -EFAULT;
>>> goto out;
>>> }
>>
>> Tangent: shouldn't this padzero() call happen even in the case where
>> last_bss <= elf_bss? The elf_map call has total_size bumped up to page
>> size, so junk may have been loaded from the file still, even if
>> last_bss == elf_bss, etc. Maybe I'm missing something.
>>
>
> When elf_map() is called the first time (with total_size !=0) a mmap() call
> with the full image size is performed and immediately (in the same elf_map()
> call) an unmap() is done in order to have only the first segment loadable
> mapped. So, the total_size here is used to ensure enough space for the elf
> (avoid overlapping because of the randomization).
>
> Subsequent calls to elf_map() (per loadable segment) sets total_size to zero
> resulting in mmaps of p_filesz size. This p_filesz is bumped up to ELF page
> size which causes to load junk from the file. I think this is intended to
> preserve the memory-mapped file benefits, so no calls to padzero() are made for
> these maps.
>
> Later, after loading all segments, if last_bss > elf_bss (memsz > filesz of the
> last segment) then we need to call padzero(elf_bss) to fill the bss with zeros,
> otherwise the bss could contain junk.

Right, though it does seem weird to me to leave junk in memory.
Regardless, this would only fix the last case, not the others, so I
agree: it should stay.

> The problem of junk occurs at the end of each loadable segment (as far at it
> does not end at page boundary). And AFAIK it has not been considered to be a
> problem, because the junk comes from the file itself, and so no memory leak.
> Typically, the last segment contains the bss (which must be zeroed). Only the
> bss must be zeroed.
>
> I don't see why it is necessary to call padzero() when there is no bss section
> (last_bss == elf_bss). Same for (last_bss <= elf_bss) which is a symptom of an
> error (intentional or not), because the p_filesz must always be <= p_memsz.
>
>
>>> /* What we have mapped so far */
>>> elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>>>
>>> <---------- elf_bss here is 0xb7795000
>>>
>>> /* Map the last of the bss segment */
>>> error = vm_brk(elf_bss, last_bss - elf_bss); <-- overflow!
>>> if (BAD_ADDR(error))
>>> goto out;
>>> }
>>>
>>> error = load_addr;
>>> out:
>>> return error;
>>> }
>>> ---------------------------------------------------------------------------
>>>
>>>
>>> The size value requested to the vm_brk() call (last_bss - elf_bss) is
>>> 0xfffffffffffff938 and internally this size is page aligned in the do_brk()
>>> function resulting in a 0 length request.
>>>
>>> static unsigned long do_brk(unsigned long addr, unsigned long len)
>>> {
>>> ...
>>> len = PAGE_ALIGN(len);
>>
>> I feel like do_brk should detect wrap-around and error out...
>
> Yes!
>
>>
>>> if (!len)
>>> return addr;
>>>
>>>
>>> Since a segment of 0 bytes is perfectly valid, it returns the requested
>>> address to vm_brk() and because it is page aligned (done by the previous
>>> line to the vm_brk() call the "error" is not detected by
>>> "BAD_ADDR(error)" and the "load_elf_interp()" functions does not
>>> returns any error. Note that vm_brk() is not necessary at all.
>>>
>>> In brief, if the end of the bss is in the same page than the last segment
>>> loaded then the size of the last of bss segment is incorrectly calculated.
>>>
>>>
>>> This patch set up to the page boundary of the last_bss variable and only do
>>> the vm_brk() call when necessary.
>>>
>>>
>>> Signed-off-by: Hector Marco-Gisbert <hecmargi@xxxxxx>
>>> Acked-by: Ismael Ripoll Ripoll <iripoll@xxxxxx>
>>> ---
>>> fs/binfmt_elf.c | 12 ++++++++----
>>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>>> index 81381cc..acfbdc2 100644
>>> --- a/fs/binfmt_elf.c
>>> +++ b/fs/binfmt_elf.c
>>> @@ -526,7 +526,7 @@ static unsigned long load_elf_interp(struct elfhdr
> *interp_elf_ex,
>>> int load_addr_set = 0;
>>> unsigned long last_bss = 0, elf_bss = 0;
>>> unsigned long error = ~0UL;
>>> - unsigned long total_size;
>>> + unsigned long total_size, size;
>>> int i;
>>>
>>> /* First of all, some simple consistency checks */
>>> @@ -626,11 +626,15 @@ static unsigned long load_elf_interp(struct elfhdr
> *interp_elf_ex,
>>>
>>> /* What we have mapped so far */
>>> elf_bss = ELF_PAGESTART(elf_bss + ELF_MIN_ALIGN - 1);
>>
>> This math is just ELF_PAGEALIGN(elf_bss)...
>>
>>> + last_bss = ELF_PAGESTART(last_bss + ELF_MIN_ALIGN - 1);
>>
>> I don't think this is correct, actually. It works for the
>> less-than-a-single-page case, but not for the intended case. I don't
>> like that this hides the PAGE_ALIGN() (on len) and PAGE_SHIFT (on
>> addr) that happens inside do_brk either, and this will break if
>> ELF_MIN_ALIGN != PAGE_SIZE.
>
> I don't see your point. ELF_MIN_ALIGN is always greater or equal than PAGE_SIZE
> and elf_map() calls align the segment size to ELF page. Also padzero() uses
> ELF_MIN_ALIGN to calculate the number of bytes to clear.

Mostly I just didn't like all the hidden side-effects, so it bears
cleaning up regardless.

>>> /* Map the last of the bss segment */
>>> - error = vm_brk(elf_bss, last_bss - elf_bss);
>>> - if (BAD_ADDR(error))
>>> - goto out;
>>> + size = last_bss - elf_bss;
>>> + if (size) {
>>> + error = vm_brk(elf_bss, size);
>>> + if (BAD_ADDR(error))
>>> + goto out;
>>> + }
>>
>> So, I think what's needed here are three patches:
>>
>> 1) move padzero() before the last_bss > elf_bss case, since the
>> zero-filling of the ELF_PAGE should have nothing to do with the
>> relationship of last_bss and elf_bss. AIUI, it's all about the
>> relationship of elf_bss to the ELF_PAGE size.
>
> As I pointed above I'm not sure of this.
>
>>
>> 2) handle the math on elf_bss vs last_bss correctly. I think this
>> requires that both be ELF_PAGE aligned, since that's the expected
>> granularity of the mappings. elf_bss already had alignment-based
>> padding happen in, so the "start" of the vm_brk should be moved
>> forward as done in the original code. However, since the "end" of the
>> vm_brk will already be PAGE_ALIGNed then last_bss should get aligned
>> here to avoid hiding it. Something like:
>>
>> elf_bss = ELF_PAGEALIGN(elf_bss);
>> last_bss = ELF_PAGEALIGN(last_bss);
>>
>> And leave the vm_brk as-is since it already checks for size==0.
>
> Yes, it is much clearer to use ELF_PAGEALIGN.
>
>>
>> 3) add code to do_brk() to check for these kinds of overflows, which
>> shouldn't happen:
>>
>> static int do_brk(unsigned long addr, unsigned long request)
>> {
>> ...
>> unsigned long len = PAGE_ALIGN(request);
>>
>> if (len < request) {
>> WARN_ONCE(...);
>> return -ENOMEM;
>> }
>> if (!len)
>> return 0;
>> ...
>>
>>
>> How's that sound?
>
>
> It sounds good :). Another possibility would be:
>
> static int do_brk(unsigned long addr, unsigned long len)
> {
> ...
> if (len > TASK_SIZE) {
> WARN_ONCE(...);
> return -ENOMEM;
> }
> ...

Hm, maybe (len < request || len > TASK_SIZE) ?

I'll give it a spin.

-Kees

>
>
>
> Hector Marco.
>
>
>>
>> -Kees
>>
>
> --
> Dr. Hector Marco-Gisbert @ http://hmarco.org/
> Cyber Security Researcher @ http://cybersecurity.upv.es
> Universitat PolitÃcnica de ValÃncia (Spain)
>



--
Kees Cook
Chrome OS & Brillo Security