Re: [PATCH v2] RISC-V: Break load reservations during switch_to
From: Palmer Dabbelt
Date: Thu Jun 20 2019 - 21:50:37 EST
On Wed, 19 Jun 2019 00:36:01 PDT (-0700), mark.rutland@xxxxxxx wrote:
On Fri, Jun 07, 2019 at 03:22:22PM -0700, Palmer Dabbelt wrote:
The comment describes why in detail. This was found because QEMU never
gives up load reservations, the issue is unlikely to manifest on real
hardware.
Thanks to Carlos Eduardo for finding the bug!
@@ -330,6 +330,17 @@ ENTRY(__switch_to)
add a3, a0, a4
add a4, a1, a4
REG_S ra, TASK_THREAD_RA_RA(a3)
+ /*
+ * The Linux ABI allows programs to depend on load reservations being
+ * broken on context switches, but the ISA doesn't require that the
+ * hardware ever breaks a load reservation. The only way to break a
+ * load reservation is with a store conditional, so we emit one here.
+ * Since nothing ever takes a load reservation on TASK_THREAD_RA_RA we
+ * know this will always fail, but just to be on the safe side this
+ * writes the same value that was unconditionally written by the
+ * previous instruction.
+ */
I suspect that you need to do the same as 32-bit ARM, and clear this in
your exception return path, rather than in __switch_to, since handlers
for interrupts and other exceptions could leave a dangling reservation.
For ARM, the architecture permits a store-exclusive to succeed even if
the address differed from the load-exclusive. I don't know if the same
applies here, but regardless I believe the case above applies if an IRQ
is taken from kernel context, since the handler can manipulate the same
variable as the interrupted code.
RISC-V has the same constraint: an LR can cause the subsequent SC on any
address to succeed. When writing the patch I thought they had to have matching
addresses, v4 should have a correct comment (assuming I've managed to send it,
I'm on my third continent this week so I'm a bit out of it).
I'd considered breaking reservations on trap entry, but decided it wasn't
necessary. I hadn't considered doing this on trap exit, but I don't see a
difference so I might be missing something. The case that I see as an issue is
when a trap comes in the middle of an LR/SC sequence, which boils down to three
cases:
* The trap handler doesn't contend with the LR/SC sequence in any way, in which
case it's fine for the sequence to succeed.
* The trap handler contends by doing its own LR/SC sequence. Since the trap
handler must execute completely before returning control back the parent, we
know the SC in the trap handler will execute. Thus there is no way the SC in
the parent can pair with the LR in the trap handler. This applies even when
traps are nested.
* The trap handler contends by performing a regular store to the same address
as the LR that was interrupted. In this case the SC must fail, and while I
assumed that the store would cause that failure the ISA manual doesn't appear
to require that behavior -- it does allow the SC to always fail in that case,
but it doesn't mandate it always fails (which is how I got confused).
Assuming the ISA manual is correct in not specifying that stores within an
LR/SC sequence break the load reservations, then I think we do need to break
load reservations on all traps. I'll go check with the ISA people, but now
that I've noticed it this does seem somewhat reasonable -- essentially it lets
LR just take a line exclusively, SC to succeed only on already exclusively held
lines, and doesn't impose any extra constraints on regular memory operations.
I don't see any reason that breaking reservations on entry as opposed to exit
would be incorrect. I feel like doing this on entry is a better bet, as we're
already doing a bunch of stores there so I don't need to bring an additional
cache line in for writing. These lines would be on the kernel stack so it's
unlikely anyone else has them for writing anyway, so maybe it just doesn't
matter. The only issue I can think of here is that there might be something
tricky for implementations to handle WRT the regular store -> store conditional
forwarding, as that's a pattern that is unlikely to manifest on regular code
but is now in a high performance path. The regular load -> store conditional
may be easier to handle, and since this is done on the kernel stack it's
unlikely that upgrading a line for writing would be an expensive operation
anyway.
LMK if I'm missing something, otherwise I'll spin another version of the patch
to break reservations on trap entry.