Re: Aarch64 EXT4FS inode checksum failures - seems to be weak memory ordering issues

From: Eric Biggers
Date: Thu Jan 07 2021 - 17:28:35 EST


On Thu, Jan 07, 2021 at 10:48:05PM +0100, Arnd Bergmann wrote:
> On Thu, Jan 7, 2021 at 5:27 PM Theodore Ts'o <tytso@xxxxxxx> wrote:
> >
> > On Thu, Jan 07, 2021 at 01:37:47PM +0000, Russell King - ARM Linux admin wrote:
> > > > The gcc bugzilla mentions backports into gcc-linaro, but I do not see
> > > > them in my git history.
> > >
> > > So, do we raise the minimum gcc version for the kernel as a whole to 5.1
> > > or just for aarch64?
> >
> > Russell, Arnd, thanks so much for tracking down the root cause of the
> > bug!
>
> There is one more thing that I wondered about when looking through
> the ext4 code: Should it just call the crc32c_le() function directly
> instead of going through the crypto layer? It seems that with Ard's
> rework from 2018, that can just call the underlying architecture specific
> implementation anyway.
>

It looks like that would work, although note that crc32c_le() uses the shash API
too, so it isn't any more "direct" than what ext4 does now.

Also, a potential issue is that the implementation of crc32c that crc32c_le()
uses might be chosen too early if the architecture-specific implementation of
crc32c is compiled as a module (e.g. crc32c-intel.ko). There are two ways this
could be fixed -- either by making it a proper library API like blake2s() that
can call the architecture-specific code directly, or by reconfiguring things
when a new crypto module is loaded (like what lib/crc-t10dif.c does).

Until one of those is done, switching to crc32c_le() might cause performance
regressions.

- Eric