Re: [RFC][PATCH] Make cryptoapi non-optional?

From: Jamie Lokier
Date: Fri Aug 15 2003 - 17:02:39 EST


Matt Mackall wrote:
> Val left out the assumption that x and y are already perfectly
> distributed, in and of themselves.

That is equivalent to saying all the bits in x are independent of each
other, and the same for y.

It is not consistent to assume that and, at the same time, rely on a
non-zero correlation between specific pairs of bits in x and y to
prove your result.

This is why:

There may be correlation between different bits of SHA output, but if
so it is just as likely, a priori, to appear between any pair of bits.

Your assumption is that the bits in x are independent of each other
and unpredictable, and the same for y, yet there may be correlations
between the n'th bit of x and the n'th bit of y for 0<=n<=79.

Given that, and the existence of a non-zero correlation:

entropy(x xor y) < entropy(x)
entropy(x xor y) < entropy(y)

Which justifies dropping half the bits instead of folding.
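
To make that concrete, here is a toy simulation of exactly that setup
(a Python sketch, not anything from the kernel; the 4-bit width, the
0.75 per-bit correlation and all the names are invented for
illustration). Each bit of y copies the matching bit of x with
probability 0.75, so x and y are each perfectly distributed on their
own, yet the measured entropy of x xor y comes out below entropy(x):

    # Toy model: x and y are each uniform K-bit values, but bit n of y
    # is correlated with bit n of x (equal with probability P).  The
    # measured entropy of x ^ y then falls below entropy(x).
    import random, math
    from collections import Counter

    K = 4          # toy bit width (standing in for 80)
    P = 0.75       # assumed Pr[y[n] == x[n]], the per-bit correlation
    N = 200000     # number of samples

    def entropy(samples):
        counts = Counter(samples)
        total = len(samples)
        return -sum(c / total * math.log2(c / total)
                    for c in counts.values())

    xs, ys, folded = [], [], []
    for _ in range(N):
        x = random.getrandbits(K)
        # Build y bit by bit: each bit copies the matching bit of x
        # with probability P, otherwise it is flipped.
        y = 0
        for n in range(K):
            bit = (x >> n) & 1
            if random.random() >= P:
                bit ^= 1
            y |= bit << n
        xs.append(x)
        ys.append(y)
        folded.append(x ^ y)

    print("entropy(x)   ~", entropy(xs))      # ~4 bits
    print("entropy(y)   ~", entropy(ys))      # ~4 bits: y alone is uniform
    print("entropy(x^y) ~", entropy(folded))  # ~3.2 bits, i.e. less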

But wait!

Why should we drop the second 80 bits, instead of (say) every other bit?
All the bits from SHA are equivalent, right?

So, let's define some permutations. Remember, x and y are 80 bits wide.

a = [ x[ 0..39], y[ 0..39] ]
b = [ x[40..79], y[40..79] ]

Now, making exactly the same assumptions (x is perfectly distributed,
y the same), considering the same possible non-zero correlation (n'th
bit of x with n'th bit of y for any 0<=n<=79), we see it is _possible_
for the following to be true, although these assumptions alone don't
make it so:

entropy(a xor b) > entropy(a) (is possible)
entropy(a xor b) > entropy(b) (is possible)

This is because of the dependence between bits _within_ a or b, which
makes entropy(a) < entropy(x) or entropy(b) < entropy(x). By contrast,
entropy(a xor b) = entropy(x) if x[n] is independent of y[n'] for
40<=n<=79 and 40<=n'<=79, which is one additional assumption that can
make the statements above true.
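
Again as a toy sketch (the same invented Python model as above, now
with the correlated bit positions confined to the low halves, i.e.
with that one additional assumption): a keeps the correlated halves
of x and y together and so loses entropy, while a xor b comes out
essentially uniform.

    # Same toy model, split the way the text describes: M correlated
    # bit positions and M uncorrelated ones.  a = [x_lo, y_lo],
    # b = [x_hi, y_hi]; a xor b recovers full entropy, a alone does not.
    import random, math
    from collections import Counter

    M = 2          # half-width, so a and b are 2*M = 4 bits wide
    P = 0.75       # assumed Pr[y[n] == x[n]] for the low positions n < M
    N = 200000

    def entropy(samples):
        counts = Counter(samples)
        total = len(samples)
        return -sum(c / total * math.log2(c / total)
                    for c in counts.values())

    a_s, b_s, ab_s = [], [], []
    for _ in range(N):
        x_lo = random.getrandbits(M)
        x_hi = random.getrandbits(M)
        # y's low bits track x's low bits with probability P ...
        y_lo = 0
        for n in range(M):
            bit = (x_lo >> n) & 1
            if random.random() >= P:
                bit ^= 1
            y_lo |= bit << n
        # ... while y's high bits are independent of everything in x.
        y_hi = random.getrandbits(M)

        a = x_lo | (y_lo << M)      # a = [x[0..M-1], y[0..M-1]]
        b = x_hi | (y_hi << M)      # b = [x[M..2M-1], y[M..2M-1]]
        a_s.append(a)
        b_s.append(b)
        ab_s.append(a ^ b)

    print("entropy(a)   ~", entropy(a_s))   # ~3.6 bits: internal dependence
    print("entropy(b)   ~", entropy(b_s))   # ~4 bits
    print("entropy(a^b) ~", entropy(ab_s))  # ~4 bits, i.e. > entropy(a)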

We have just shown that, within the very assumptions which prove
entropy(x xor y) < entropy(x), it is possible to have
entropy(a xor b) > entropy(a), where [a,b] is merely a bit permutation
of [x,y]. In other words: your assumptions imply you should drop
certain bits rather than others, because otherwise dropping can be
worse than folding.

This is clearly absurd :)


Eh?
---

The reasoning error is to assume that certain pairs of bits are
independent while entertaining the possibility that certain other
pairs of bits are dependent.

Of course bias in choosing those pairs favours one way of reducing the
hash output compared with another.

What happens when you remove the bias? Now that is a more interesting
question.

Perhaps you can show that dropping is better than folding, given no
bias in your correlation assumptions, but I don't see that in the
claim quoted at the start of this message.


Enjoy,
-- Jamie