Re: [GIT PULL] CRC updates for 6.14
From: David Laight
Date: Thu Jan 23 2025 - 18:18:04 EST
On Thu, 23 Jan 2025 13:16:03 -0800
Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> On Thu, Jan 23, 2025 at 08:58:10PM +0000, David Laight wrote:
...
> > For a small memory footprint it might be worth considering 4 bits at a time.
> > So a 16 word (64 byte) lookup table.
> > Thinks....
> > You can xor a data byte onto the crc 'accumulator' and then do two separate
> > table lookups for each of the high nibbles and xor both onto it before the rotate.
> > That is probably a reasonable compromise.
>
> Yes, you can do less than a byte at a time (currently one of the choices is even
> one *bit* at a time!), but I think byte-at-a-time is small enough already.
I used '1 bit at a time' for a crc64 of a 5MB file.
It was actually fast enough during a 'compile' phase (verified by a serial eeprom).
But the paired nibble one is something like:
	crc ^= *data++ << 24;
	crc ^= table[crc >> 28] ^ table1[(crc >> 24) & 15];
	crc = rol(crc, 8);
which isn't going to be significantly slower than the byte one
where the middle line is:
	crc ^= table[crc >> 24];
especially for a multi-issue cpu,
and the table drops from 1k to 128 bytes.
That saves quite a lot of D-cache misses.
(Since you'll probably get them all twice when the program's working
set is reloaded!)
Actually you want the table[] entries stored already rol()ed by 8.
Then do:
	crc = rol(crc, 8) ^ table[] ...
which reduces the register dependency chain to 5 per byte.
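
A minimal C sketch of that last form, checked against a bit-at-a-time
reference. Everything named here is an assumption for illustration and is
not taken from the kernel tree: a 32-bit MSB-first CRC with polynomial
0x04c11db7, tables built at run time, and the names tab_hi[], tab_lo[],
rol32() and crc32_nibble_pair(). The table entries have the index xor'ed
back in, so the byte that rol() carries round into the low bits cancels out.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY 0x04c11db7u	/* assumed: MSB-first CRC-32 polynomial */

static uint32_t tab_hi[16];	/* indexed by the high nibble of the top byte */
static uint32_t tab_lo[16];	/* indexed by the low nibble of the top byte */

/* Reference: one bit at a time, no tables. */
static uint32_t crc32_bitwise(uint32_t crc, const uint8_t *data, size_t len)
{
	while (len--) {
		crc ^= (uint32_t)*data++ << 24;
		for (int i = 0; i < 8; i++)
			crc = (crc << 1) ^ ((crc & 0x80000000u) ? CRC32_POLY : 0);
	}
	return crc;
}

/* CRC of one byte from a zero state: the usual 256-entry table[] value. */
static uint32_t crc32_of_byte(uint8_t b)
{
	return crc32_bitwise(0, &b, 1);
}

/* Build the two 16-entry tables (2 * 64 bytes = 128 bytes in total). */
static void crc32_nibble_init(void)
{
	for (uint32_t n = 0; n < 16; n++) {
		/*
		 * XOR the index back in so rol32(crc, 8) can stand in for
		 * crc << 8: the rotated-in low byte is cancelled here.
		 */
		tab_hi[n] = crc32_of_byte(n << 4) ^ (n << 4);
		tab_lo[n] = crc32_of_byte(n) ^ n;
	}
}

static inline uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* Paired-nibble update: two small lookups per input byte. */
static uint32_t crc32_nibble_pair(uint32_t crc, const uint8_t *data, size_t len)
{
	while (len--) {
		crc ^= (uint32_t)*data++ << 24;
		crc = rol32(crc, 8) ^ tab_hi[crc >> 28] ^ tab_lo[(crc >> 24) & 15];
	}
	return crc;
}
```

Call crc32_nibble_init() once, then crc32_nibble_pair(0xffffffff, buf, len).
The split works because the CRC is linear over xor, so the byte table entry
for (hi << 4 | lo) is the xor of the entries for (hi << 4) and lo.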
David