Re: [PATCH v2] buffer: Fix I/O error due to ARM read-after-read hazard
From: Will Deacon
Date: Tue Nov 12 2019 - 13:00:43 EST

On Tue, Nov 12, 2019 at 04:08:57PM +0000, Catalin Marinas wrote:
> On Tue, Nov 12, 2019 at 02:02:44PM +0100, Vincent Whitchurch wrote:
> > On my dual-core ARM Cortex-A9, reading from squashfs (over
> > dm-verity/ubi/mtd) in a loop for hundreds of hours invariably results in
> > a read failure in squashfs_read_data(). The errors occur because the
> > buffer_uptodate() check fails after wait_on_buffer(). Further debugging
> > shows that the bh was in fact uptodate and that there is no actual I/O
> > error in the lower layers.
> >
> > The problem is caused by the read-after-read hazards in the ARM
> > Cortex-A9 MPCore (erratum #761319, see [1]). The code generated by the
> > compiler for the combination of the wait_on_buffer() and
> > buffer_uptodate() calls reads the flags value twice from memory (see the
> > excerpt of the assembly below). The new value of the BH_Lock flag is
> > seen but the new value of BH_Uptodate is not even though both the bits
> > are read from the same memory location.
> >
> > 27c: 9d08 ldr r5, [sp, #32]
> > 27e: 2400 movs r4, #0
> > 280: e006 b.n 290 <squashfs_read_data+0x290>
> > 282: 6803 ldr r3, [r0, #0]
> > 284: 07da lsls r2, r3, #31
> > 286: f140 810d bpl.w 4a4 <squashfs_read_data+0x4a4>
> > 28a: 3401 adds r4, #1
> > 28c: 42bc cmp r4, r7
> > 28e: da08 bge.n 2a2 <squashfs_read_data+0x2a2>
> > 290: f855 0f04 ldr.w r0, [r5, #4]!
> > 294: 6803 ldr r3, [r0, #0]
> > 296: 0759 lsls r1, r3, #29
> > 298: d5f3 bpl.n 282 <squashfs_read_data+0x282>
> > 29a: f7ff fffe bl 0 <__wait_on_buffer>
> >
> > Work around this problem by adding a DMB between the two reads of
> > bh->flags, as recommended in the ARM document. With this barrier, no
> > failures have been seen in more than 5000 hours of the same test.
> >
> > [1] http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
>
> I thought we were going to fix the compiler. I found an old thread here:
>
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html
>
> Also cc'ing Richard Earnshaw as he may have been involved in the gcc
> discussion at the time.
>
> While you can add some barrier here, there may be other cases where this
> can go wrong.

Hmm, and afaict, even if the compiler was modified to emit LDREX instructions
for volatile loads, it wouldn't help in this case because test_bit() isn't
using READ_ONCE().

It's also slightly odd that the proposed patch makes the code look like:

	for (i = 0; i < b; i++) {
		if (buffer_locked(bh[i])) {
			__wait_on_buffer(bh[i]);
			smp_rmb();
		}

		if (!buffer_uptodate(bh[i]))
			goto block_release;
	}

whereas there are other potential RAR orderings between buffer_locked()
and __wait_on_buffer() and also probably between successive iterations
of the loop.

So, really, the only way I see to solve this is for us to use READ_ONCE
consistently for all relaxed atomic loads (KCSAN is starting to tread on
this), and then to patch READ_ONCE to emit a DMB at runtime for arch/arm/
(maybe a static key would work if you can avoid the recursion).

I've already got patches at [1] to allow architectures to override
READ_ONCE, because Alpha needs to do something similar.
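
To make the READ_ONCE idea a bit more concrete, here is a rough user-space
sketch, not kernel code and not what the patches at [1] do. Every name in it
(read_once_rar(), rar_workaround_enabled, struct fake_bh and the fake_*
helpers) is hypothetical. The bit numbers are only assumed from the shifts in
the quoted disassembly, a plain bool stands in for a static key, and the
compiler's __atomic_thread_fence() builtin stands in for the DMB that the
kernel would emit with its own barrier macros.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit numbers, chosen to match the shifts in the quoted disassembly. */
enum { FAKE_BH_UPTODATE = 0, FAKE_BH_LOCK = 2 };

/* Hypothetical stand-in for a buffer_head; only the flags word matters here. */
struct fake_bh {
	unsigned long b_state;
};

/* Hypothetical stand-in for a static key enabled only on affected CPUs. */
static bool rar_workaround_enabled = true;

/*
 * Volatile load followed by a full fence when the workaround is enabled.
 * On ARMv7 the compiler should emit a DMB for this fence.
 */
static inline unsigned long read_once_rar(const unsigned long *p)
{
	unsigned long val = *(const volatile unsigned long *)p;

	if (rar_workaround_enabled)
		__atomic_thread_fence(__ATOMIC_SEQ_CST);
	return val;
}

/* Bit test that routes its load through read_once_rar(). */
static inline bool fake_test_bit(int nr, const unsigned long *word)
{
	return (read_once_rar(word) >> nr) & 1UL;
}

static inline bool fake_buffer_locked(const struct fake_bh *bh)
{
	return fake_test_bit(FAKE_BH_LOCK, &bh->b_state);
}

static inline bool fake_buffer_uptodate(const struct fake_bh *bh)
{
	return fake_test_bit(FAKE_BH_UPTODATE, &bh->b_state);
}

/*
 * Every flag read goes through read_once_rar(), so a fence separates each
 * pair of loads, including loads in successive iterations.
 */
static bool all_unlocked_and_uptodate(struct fake_bh **bh, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (fake_buffer_locked(bh[i]))
			return false;	/* a real caller would wait here */
		if (!fake_buffer_uptodate(bh[i]))
			return false;
	}
	return true;
}
```

The point of putting the fence in the accessor rather than at one call site
is that it covers all of the load pairs mentioned above without anyone having
to enumerate them. The cost is a barrier on every such load whenever the
workaround is enabled.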

Will

[1] https://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git/log/?h=lto