Re: [v2 PATCH] arm64: mm: force write fault for atomic RMW instructions

From: Yang Shi
Date: Thu May 23 2024 - 18:13:39 EST




On 5/23/24 2:34 PM, Catalin Marinas wrote:
> On Thu, May 23, 2024 at 12:43:34PM -0700, Christoph Lameter (Ampere) wrote:
>> On Thu, 23 May 2024, Catalin Marinas wrote:
>>> While this class includes all atomics that currently require write
>>> permission, there's some unallocated space in this range and we don't
>>> know what future architecture versions may introduce. Unfortunately we
>>> need to check each individual atomic op in this class (not sure what the
>>> overhead will be).
>> Can you tell us which bits or pattern is not allocated? Maybe we can exclude
>> that from the pattern.
> Yes, it may be easier to exclude those patterns. See the Arm ARM K.a
> section C4.1.94.29 (page 791).
>> Hmmm. We could consult an exception table once the pattern matches to reduce
>> the overhead.
> Yeah, check the atomic class first and then go into the finer-grained
> details. I think this would reduce the overhead for non-atomic
> instructions.

If I read the instruction encoding correctly, the unallocated instructions are determined by the following fields:

  - size
  - VAR
  - o3
  - opc

To exclude them I think we can do something like:

if atomic instructions {
    if V == 1
        return false;
    if o3 opc == 111x
        return false;
    switch VAR {
        000
            check o3 and opc
        001
            check o3 and opc
        010
            check o3 and opc
        011
            check o3 and opc
        default
            if size != 11
                check o3 and opc
    }
}

So the worst case takes 4 branches plus one branch per unallocated combination of o3 and opc. I counted 5 such combinations, so 9 branches in the worst case.
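
To make the coarse part of the pseudocode above concrete, here is a minimal C sketch. It is only an illustration, not the patch itself: the class mask/value and the field positions (size at bits 31:30, V at 26, A at 23, R at 22, o3 at 15, opc at 14:12) are my reading of the encoding and should be checked against the Arm ARM, and the names `decode_atomic_fields()` and `passes_coarse_filters()` are hypothetical. The per-VAR "check o3 and opc" steps are deliberately left out, since the exact unallocated combinations still need to be confirmed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed encoding of the atomic memory operations class:
 * bits 29:27 == 111, bits 25:24 == 00, bit 21 == 1, bits 11:10 == 00.
 */
#define ATOMIC_CLASS_MASK 0x3b200c00u
#define ATOMIC_CLASS_VAL  0x38200000u

/* Fields that decide whether an encoding in the class is allocated. */
struct atomic_fields {
	unsigned int size, v, a, r, o3, opc;
};

/* Returns false if insn is not in the atomic class; otherwise fills *f. */
static bool decode_atomic_fields(uint32_t insn, struct atomic_fields *f)
{
	if ((insn & ATOMIC_CLASS_MASK) != ATOMIC_CLASS_VAL)
		return false;

	f->size = (insn >> 30) & 0x3;
	f->v    = (insn >> 26) & 0x1;
	f->a    = (insn >> 23) & 0x1;
	f->r    = (insn >> 22) & 0x1;
	f->o3   = (insn >> 15) & 0x1;
	f->opc  = (insn >> 12) & 0x7;
	return true;
}

/*
 * Coarse filters only: class match, V == 0, and o3:opc != 111x.
 * A real implementation would still need the per-VAR o3/opc checks.
 */
static bool passes_coarse_filters(uint32_t insn)
{
	struct atomic_fields f;

	if (!decode_atomic_fields(insn, &f))
		return false;
	if (f.v == 1)
		return false;
	if (f.o3 == 1 && (f.opc >> 1) == 0x3)	/* o3:opc == 111x */
		return false;
	return true;
}
```

Non-atomic instructions fall out at the first mask comparison, so the extra field checks are only paid for encodings that are already in the atomic class.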


>> However, the harm done I think is acceptable even if we leave things as is.
>> In the worst case we create unnecessary write fault processing for an
>> "atomic op" that does not need write access. Also: Why would it need to be
>> atomic if it does not write???
> I'm thinking of some conditional instruction that states no write if
> condition fails. But it could be even worse if the architects decide to
> reuse that unallocated space for some instructions that have nothing to
> do with the atomic accesses.

Even if the condition fails, forcing a write fault still seems fine, IIUC. I suppose the read fault will happen regardless of the condition, and a zero-filled page is then installed. This is guaranteed. We just end up with write permission instead of read-only permission. We are also in this state transiently with the currently supported atomic instructions.

But if they are allocated to non-atomic instructions, we will have to do fine-grained decoding, though it may be easier since we can just filter out those non-atomic instructions. Anyway, it depends on how they will be used. Hopefully this won't happen.


> It's something we need to clarify with them but I'm about to go on
> holiday for a week, so I won't be able to check.

Have a good holiday.


>> The ultimate solution would be to change the spec so that arm processors can
>> skip useless read faults.
> I raised this already, waiting for feedback from the architects.

Thank you so much.