Re: [RFC] arm64: Enforce observed order for spinlock and data
Date: Thu Oct 13 2016 - 16:01:55 EST
On 2016-10-13 07:02, Will Deacon wrote:
On Wed, Oct 12, 2016 at 04:01:06PM -0400, bdegraaf@xxxxxxxxxxxxxx
Everything from this point down needs clarification.
All arm64 lockref accesses that occur without taking the spinlock must
behave like true atomics, ensuring successive operations are all done
What is a "true atomic"? What do you mean by "successive"? What do you
mean by "done sequentially"?
Despite the use case in dentry, lockref itself is a generic locking API,
any problems I describe here are with the generic API itself, not
the dentry use case. I'm not patching dentry--I'm fixing lockref.
By necessity, the API must do its update atomically, as keeping things
involves potential spinlock access by other agents which may opt to use
spinlock API or the lockref API at their discretion. With the current
spinlock implementation, it is possible for the lockref to observe the
contents of the protected count without observing the spinlock being
which could lead to missed changes to the lock_count itself, because any
calculations made on it could be overwritten or completed in a different
(Spinlock locked access is obtained with a simple store under certain
scenarios. My attempt to fix this in the spinlock code was met with
saying it should be addressed in lockref, since that is the API that
encounter the issue. These changes were initiated in response to that
Additional ordering problems were uncovered when I looked into lockref
The example below involves only a single agent exactly as you explain
problem in commit 8e86f0b409a44193f1587e87b69c5dcf8f65be67. Even for a
execution agent, this means that the code below could access out of
As lockref is a generic API, it doesn't matter whether dentry does this
By "done sequentially," I mean "accessed in program order."
As far as "true atomic" goes, I am referring to an atomic in the same
did in commit 8e86f0b409a44193f1587e87b69c5dcf8f65be67.
The guarantee provided by lockref is that, if you hold the spinlock,
you don't need to use atomics to inspect the reference count, as it is
guaranteed to be stable. You can't just go around replacing spin_lock
calls with lockref_get -- that's not what this is about.
I am not sure where you got the idea that I was referring to replacing
with lockref calls. That is not the foundation for this fix.
the lockref accesses, when decompiled, look like the following
<Lockref "unlocked" Access [A]>
// Lockref "unlocked" (B)
1: ldxr x0, [B] // Exclusive load
<change lock_count B>
stxr w1, x0, [B]
cbnz w1, 1b
<Lockref "unlocked" Access [C]>
Even though access to the lock_count is protected by exclusives, this
to guarantee order: The lock_count must change atomically, in order,
permitted ordering would be:
A -> B -> C
Says who? Please point me at a piece of code that relies on this. I'm
willing to believe that are bugs in this area, but waving your hands
and saying certain properties "must" hold is not helpful unless you can
say *why* they must hold and *where* that is required.
The lockref code must access in order, because other agents can observe
spinlock OR lockref APIs. Again, this is a generic API, not an explicit
dentry. Other code will use it, and the manner in which it is used in
dentry is not
relevant. What lock_count is changed to is not proscribed by the
API. There is no guarantee whether it be an add, subtract, multiply,
to some explicit value, etc. But the changes must be done in program
observable in that same order by other agents: Therefore, the spinlock
must be accessed atomically, and observed to change atomically at the
I am not off base saying lockref is an atomic access. Here are some
Under Documentation/filesystems/path-lookup.md, the dentry->d_lockref
is described as an atomic access.
At the time lockref was introduced, The Linux Foundation gave a
LinuxCon 2014 that can be found at the following link:
On page 46, it outlines the lockref API. The first lines of the slide
â *Generic* mechanism to *atomically* update a reference count that is
protected by a spinlock without actually acquiring the spinlock
While dentry's use is mentioned, this API is not restricted to the use
case of dentry.
Unfortunately, this is not the case by the letter of the architecture
the accesses to A and C are not protected by any sort of barrier, and
permitted to reorder freely, resulting in orderings such as
Bl -> A -> C -> Bs
Again, why is this a problem? It's exactly the same as if you did:
Accesses outside of the critical section can still be reordered. Big
Since the current code resembles but actually has *fewer* ordering
compared to the example used by your atomic.h commit, even though
A->B->C is in
program order, it could access out of order according to your own commit
on commit 8e86f0b409a44193f1587e87b69c5dcf8f65be67.
Taking spin_lock/spin_unlock, however, includes ordering by nature of
load-acquire observing the store-release of a prior unlock, so ordering
enforced with the spinlock version of accesses. The lockref itself has
ordering enforced, unless a locked state is encountered and it falls
to the spinlock code. So this is a fundamental difference between
spinlock. So, no, lockref ordering is currently not exactly the same as
spinlock--but it should be.
In this specific scenario, since "change lock_count" could be an
increment, a decrement or even a set to a specific value, there could
Take, for example, a use case where the ref count is either positive or
If increments and decrements hit out of order, a decrement that was
to come after an increment would instead do nothing if the value of the
started at zero. Then when the increment hit later, the ref count would
positive with a net effect of +1 to the ref count instead of +1-1=0.
however, the lockref does not specify how the contents of lock_count are
manipulated, it was only meant to guarantee that they are done
the lock is not held.
With more agents accessing the lockref without taking the lock, even
scenarios where the cmpxchg passes falsely can be encountered, as
no guarantee that the the "old" value will not match exactly a newer
due to out-of-order access by a combination of agents that increment
decrement the lock_count by the same amount.
This is the A-B-A problem, but I don't see why it affects us here.
dealing with a single reference count.
If lockref accesses were to occur on many Pe's, there are all sorts of
things that could happen in terms of who wins what, and what they set
lock_count to. My point is simply that each access should be atomic
lockref is a generic API and was intended to be a lockless atomic
Leaving this problem open until someone else introduces a use that
it, which could happen in the main kernel code, is probably not a good
idea, as it could prove difficult to track down.
Since multiple agents are accessing this without locking the spinlock,
this access must have the same protections in place as atomics do in
Why? I don't think that it does. Have a look at how lockref is used by
the dcache code: it's really about keeping a reference to a dentry,
which may be in the process of being unhashed and removed. The
interaction with concurrent updaters to the dentry itself is handled
using a seqlock, which does have the necessary barriers. Yes, the code
is extremely complicated, but given that you're reporting issues based
on code inspection, then you'll need to understand what you're
Again, this is a generic API, not an API married to dentry. If it were
dentry's sole use, it should not be accessible outside of the dentry
While the cmpxchg64_relaxed case may be OK for dentry, it is not OK for
Fortunately, the fix is not complicated: merely removing the errant
_relaxed option on the cmpxchg64 is enough to introduce exactly the
code sequence justified in commit
to fix arm64 atomics.
I introduced cmpxchg64_relaxed precisely for the lockref case. I still
don't see a compelling reason to strengthen it. If you think there's a
please spend the effort to describe how it manifests and what can
go wrong in the existing codebase. Your previous patches fixing
bugs found by inspection have both turned out to be bogus, so I'm
but I'm not exactly leaping on your contributions to this.
I have detailed the problems here, and they are with the generic case,
hand waving required.
On a further note, it is not accurate to say that my prior patches were
bogus: One called to attention a yet-to-be-corrected problem in the
Programmer's Guide, and the other was sidestepped by a refactor that
addressed the problem I set out to fix with a control flow change. Since
that problem was the fundamental reason I had worked on the gettime code
in the first place, I abandoned my effort. The refactor that fixed the
control-flow problem, however, is still missing on v4.7 and earlier
(sequence lock logic should be verified prior to the isb that demarcates
the virtual counter register read). I have confirmed this is an issue on
various armv8 hardware, sometimes obtaining identical register values
between multiple reads that were delayed such that they should have
changes, evidence that the register read accessed prior to the seqlock
update having finished (the control flow problem).