Re: Aarch64 EXT4FS inode checksum failures - seems to be weak memory ordering issues
From: Russell King - ARM Linux admin
Date: Thu Jan 07 2021 - 07:46:23 EST
On Thu, Jan 07, 2021 at 11:18:41AM +0000, Russell King - ARM Linux admin wrote:
> On Wed, Jan 06, 2021 at 10:32:23PM +0000, Russell King - ARM Linux admin wrote:
> > On Wed, Jan 06, 2021 at 05:20:34PM +0000, Will Deacon wrote:
> > > With that, I see the following after ten seconds or so:
> > >
> > > EXT4-fs error (device sda2): ext4_lookup:1707: inode #674497: comm md5sum: iget: checksum invalid
> > >
> > > Russell, Mark -- does this recipe explode reliably for you too?
> >
> > I've been working this evening on tracking down what change in the
> > Kconfig file between your working 5.10 kernel binary you supplied me,
> > and my failing 5.9 kernel.
> >
> > I've found that _enabling_ CONFIG_STACKPROTECTOR appears to mask the
> > inode checksum failure problem, at least from a short test.) I'm going
> > to re-enable CONFIG_STACKPROTECTOR and leave it running for longer.
> >
> > That is:
> >
> > CONFIG_STACKPROTECTOR=y
> > CONFIG_STACKPROTECTOR_STRONG=y
> >
> > appears to mask the problem
> >
> > # CONFIG_STACKPROTECTOR is not set
> >
> > appears to unmask the problem.
>
> We have finally got to the bottom of this - the "bug" is in the ext4
> code:
>
> static inline u32 ext4_chksum(struct ext4_sb_info *sbi, u32 crc,
> const void *address, unsigned int length)
> {
> struct {
> struct shash_desc shash;
> char ctx[4];
> } desc;
>
> BUG_ON(crypto_shash_descsize(sbi->s_chksum_driver)!=sizeof(desc.ctx));
>
> desc.shash.tfm = sbi->s_chksum_driver;
> *(u32 *)desc.ctx = crc;
>
> BUG_ON(crypto_shash_update(&desc.shash, address, length));
>
> return *(u32 *)desc.ctx;
> }
>
> This isn't always inlined, despite the "inline" keyword. With GCC
> 4.9.4, this is compiled to the following code when the stack protector
> is disabled:
>
> 0000000000000004 <ext4_chksum.isra.14.constprop.19>:
> 4: a9be7bfd stp x29, x30, [sp, #-32]! <------
> 8: 2a0103e3 mov w3, w1
> c: aa0203e1 mov x1, x2
> 10: 910003fd mov x29, sp <------
> 14: f9000bf3 str x19, [sp, #16]
> 18: d10603ff sub sp, sp, #0x180 <------
> 1c: 9101fff3 add x19, sp, #0x7f
> 20: b9400002 ldr w2, [x0]
> 24: 9279e273 and x19, x19, #0xffffffffffffff80 <------
> 28: 7100105f cmp w2, #0x4
> 2c: 540001a1 b.ne 60 <ext4_chksum.isra.14.constprop.19+0x5c> // b.any
> 30: 2a0303e4 mov w4, w3
> 34: aa0003e3 mov x3, x0
> 38: b9008264 str w4, [x19, #128]
> 3c: aa1303e0 mov x0, x19
> 40: f9000263 str x3, [x19] <------
> 44: 94000000 bl 0 <crypto_shash_update>
> 44: R_AARCH64_CALL26 crypto_shash_update
> 48: 350000e0 cbnz w0, 64 <ext4_chksum.isra.14.constprop.19+0x60>
> 4c: 910003bf mov sp, x29 <======
> 50: b9408260 ldr w0, [x19, #128] <======
> 54: f9400bf3 ldr x19, [sp, #16]
> 58: a8c27bfd ldp x29, x30, [sp], #32
> 5c: d65f03c0 ret
> 60: d4210000 brk #0x800
> 64: 97ffffe7 bl 0 <ext4_chksum.isra.14.part.15>
>
> Of the instructions that are highlighted with "<------" and "<======",
> x29 is located at the bottom of the function's stack frame, excluding
> local variables. x19 is "desc", which is calculated to be safely below
> x29 and aligned to a 128 byte boundary.
>
> The bug is pointed to by the two "<======" markers - the instruction
> at 4c restores the stack pointer _above_ "desc" before then loading
> desc.ctx.
>
> If an interrupt occurs right between these two instructions, then
> desc.ctx will be corrupted, leading to the checksum failing.
>
> Comments on irc are long the lines of this being "an impressive
> compiler bug".
>
> We now need to find which gcc versions are affected, so we know what
> minimum version to require for aarch64.
>
> Arnd has been unable to find anything in gcc bugzilla to explain this;
> he's tested gcc-5.5.0, which appears to produce correct code, and is
> trying to bisect between 4.9.4 and 5.1.0 to locate where this was
> fixed.
>
> Peter Zijlstra suggested adding linux-toolchains@ and asking compiler
> folks for feedback on this bug. I guess a pointer to whether this is
> a known bug, and which bug may be useful.
>
> I am very relieved to have found a positive reason for this bug, rather
> than just moving forward on the compiler and have the bug vanish
> without explanation, never knowing if it would rear its head in future
> and corrupt my filesystems, e.g. never knowing if it became a
> temporarily masked memory ordering bug.
Arnd has found via bisecting gcc:
7e8c2bd54af ("[AArch64] fix unsafe access to deallocated stack")
which seems to be https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63293
That seems to suggest that gcc-5.0.0 is also affected.
Looking at the changelog in Debian's gcc-8.3 packages, this doesn't
feature, so it's not easy just to look at the changelogs to work out
which versions are affected.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!