Re: [PATCH 1/5] x86, KASLR: Update description for decompressor worst case size

From: Baoquan He
Date: Thu Apr 21 2016 - 23:13:33 EST


On 04/21/16 at 01:04pm, Kees Cook wrote:
> On Thu, Apr 21, 2016 at 7:47 AM, Borislav Petkov <bp@xxxxxxx> wrote:
> > On Wed, Apr 20, 2016 at 01:55:42PM -0700, Kees Cook wrote:
> > ...
> > What's "non compressed data overhead"?
> >
> > Does that want to say:
> >
> > "... resulting in 18 bytes overhead of uncompressed data."
> >
> > perhaps?
>
> Yeah, that reads much more clearly. I'll change it.
>
> >> +#
> >> +# Files divided into blocks
> >> +# 1 bit (last block flag)
> >> +# 2 bits (block type)
> >> +#
> >> +# 1 block occurs every 32K -1 bytes or when there 50% compression
> >> +# has been achieved. The smallest block type encoding is always used.
> >> +#
> >> +# stored:
> >> +# 32 bits length in bytes.
> >> +#
> >> +# fixed:
> >> +# magic fixed tree.
> >> +# symbols.
> >> +#
> >> +# dynamic:
> >> +# dynamic tree encoding.
> >> +# symbols.
> >> +#
> >> +#
> >> +# The buffer for decompression in place is the length of the uncompressed
> >> +# data, plus a small amount extra to keep the algorithm safe. The
> >> +# compressed data is placed at the end of the buffer. The output pointer
> >> +# is placed at the start of the buffer and the input pointer is placed
> >> +# where the compressed data starts. Problems will occur when the output
> >> +# pointer overruns the input pointer.
> >> +#
> >> +# The output pointer can only overrun the input pointer if the input
> >> +# pointer is moving faster than the output pointer. A condition only
> >> +# triggered by data whose compressed form is larger than the uncompressed
> >> +# form.
> >> +#
> >> +# The worst case at the block level is a growth of the compressed data
> >> +# of 5 bytes per 32767 bytes.
> >> +#
> >> +# The worst case internal to a compressed block is very hard to figure.
> >> +# The worst case can at least be bounded by having one bit that represents
> >> +# 32764 bytes and then all of the rest of the bytes representing the very
> >> +# very last byte.
> >> +#
> >> +# All of which is enough to compute an amount of extra data that is required
> >> +# to be safe. To avoid problems at the block level allocating 5 extra bytes
> >> +# per 32767 bytes of data is sufficient. To avoid problems internal to a
> >> +# block adding an extra 32767 bytes (the worst case uncompressed block size)
> >> +# is sufficient, to ensure that in the worst case the decompressed data for
> >> +# block will stop the byte before the compressed data for a block begins.
> >> +# To avoid problems with the compressed data's meta information an extra 18
> >> +# bytes are needed. Leading to the formula:
> >> +#
> >> +# extra_bytes = (uncompressed_size >> 12) + 32768 + 18 + decompressor_size
> >> +#
> >> +# Adding 8 bytes per 32K is a bit excessive but much easier to calculate.
> >> +# Adding 32768 instead of 32767 just makes for round numbers.
> >> +# Adding the decompressor_size is necessary as it musht live after all
> >> +# of the data as well. Last I measured the decompressor is about 14K.
> >> +# 10K of actual data and 4K of bss.
> >
> > I guess reflow the paragraphs while at it, as well?
> >
> > "Adding 8 bytes per 32K is a bit excessive but much easier to calculate.
> > Adding 32768 instead of 32767 just makes for round numbers. Adding the
> > decompressor_size is necessary as it musht live after all of the data as
~~must
There's a typo here, it should be 'must'. I didn't notice before. It
might be fixed when take Boris's paragraph reflowing. :)

> > well. Last I measured the decompressor is about 14K. 10K of actual data
> > and 4K of bss."
> >
> > and so on...
>
> Yeah, I'd been reflowing as I went and I went back and forth on that
> one. It looked like it was a list ("Adding... Adding... Adding...") so
> I'd left it, but my initial instinct matches your: it should just get
> reflowed like all the rest.
>
> I'll fix that too.
>
> Thanks for the review!
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security