On Thu, May 23, 2024 at 03:13:23PM -0700, Yang Shi wrote:
> On 5/23/24 2:34 PM, Catalin Marinas wrote:
> > On Thu, May 23, 2024 at 12:43:34PM -0700, Christoph Lameter (Ampere) wrote:
> > > On Thu, 23 May 2024, Catalin Marinas wrote:
> > > > > > While this class includes all atomics that currently require write
> > > > > > permission, there's some unallocated space in this range and we don't
> > > > > > know what future architecture versions may introduce. Unfortunately we
> > > > > > need to check each individual atomic op in this class (not sure what the
> > > > > > overhead will be).
> > > > > Can you tell us which bits or pattern is not allocated? Maybe we can exclude
> > > > > that from the pattern.
> > > > Yes, it may be easier to exclude those patterns. See the Arm ARM K.a
> > > > section C4.1.94.29 (page 791).
> > > Hmmm. We could consult an exception table once the pattern matches to reduce
> > > the overhead.
> > Yeah, check the atomic class first and then go into the finer-grained
> > details. I think this would reduce the overhead for non-atomic
> > instructions.
> If I read the instruction encoding correctly, the unallocated instructions
> are decided by the below fields:
> - size
> - VAR
> - o3
> - opc
> To exclude them I think we can do something like:
>
> if atomic instructions {
>     if V == 1
>         return false;
>     if o3 opc == 111x
>         return false;
>     switch VAR {
>         000
>             check o3 and opc
>         001
>             check o3 and opc
>         010
>             check o3 and opc
>         011
>             check o3 and opc
>         default
>             if size != 11
>                 check o3 and opc
>     }
> }
>
> So it may take 4 + the possible unallocated combos of o3 and opc branches
> for the worst case. I saw 5 different combos for o3 and opc, so 9 branches
> for worst cases.

Or we have a sorted table of exclusions and do a binary search. Not sure
which one is faster.

> But if they will be allocated to non-atomic instructions, we have to do
> fine-grained decoding, but it may be easier since we can just filter out
> those non-atomic instructions? Anyway it depends on how they will be used.
> Hopefully this won't happen.

Actually, the atomics table has LD64B and LDAPR already which are load
instructions, no write permission needed. So we need to exclude these
and all the unallocated space in this range.
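
To make the sorted-table idea concrete, here is a rough, untested-on-hardware
sketch in plain userspace C. Everything named in it is hypothetical: the
mask/value pair is my reading of the atomic memory operations class encoding,
the key layout is invented for the example, and the table lists only the
LDAPR variants and LD64B mentioned above. The unallocated combinations would
still have to be filled in from the Arm ARM, so as written this still accepts
unallocated encodings. In the kernel one would presumably use the helper from
linux/bsearch.h rather than the libc one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Assumed class match for "atomic memory operations":
 * bits [29:27] = 111, [25:24] = 00, [21] = 1, [11:10] = 00.
 * Verify against the Arm ARM before relying on it.
 */
#define ATOMIC_CLASS_MASK	0x3b200c00u
#define ATOMIC_CLASS_VAL	0x38200000u
#define ATOMIC_V_BIT		(1u << 26)

/* Hypothetical key layout: size in [7:6], A:R in [5:4], o3:opc in [3:0]. */
#define ATOMIC_KEY(size, ar, o3opc)	(((size) << 6) | ((ar) << 4) | (o3opc))

/*
 * Encodings in the class that must NOT be treated as write atomics.
 * Must stay sorted in ascending order for the binary search.
 * Incomplete: only the loads named in this thread are listed.
 */
static const uint8_t atomic_excluded[] = {
	ATOMIC_KEY(0, 2, 0xc),	/* LDAPRB */
	ATOMIC_KEY(1, 2, 0xc),	/* LDAPRH */
	ATOMIC_KEY(2, 2, 0xc),	/* LDAPR (32-bit) */
	ATOMIC_KEY(3, 0, 0xd),	/* LD64B */
	ATOMIC_KEY(3, 2, 0xc),	/* LDAPR (64-bit) */
};

/* Extract size, A:R and o3:opc from the instruction and pack them. */
static uint8_t atomic_key(uint32_t insn)
{
	uint32_t size = (insn >> 30) & 0x3;
	uint32_t ar = (insn >> 22) & 0x3;
	uint32_t o3opc = (insn >> 12) & 0xf;

	return (uint8_t)ATOMIC_KEY(size, ar, o3opc);
}

static int cmp_key(const void *a, const void *b)
{
	uint8_t ka = *(const uint8_t *)a;
	uint8_t kb = *(const uint8_t *)b;

	return (ka > kb) - (ka < kb);
}

/* Cheap class check first, then the finer-grained exclusion lookup. */
static bool insn_is_write_atomic(uint32_t insn)
{
	uint8_t key;

	if ((insn & ATOMIC_CLASS_MASK) != ATOMIC_CLASS_VAL)
		return false;	/* not in the atomic class at all */
	if (insn & ATOMIC_V_BIT)
		return false;	/* V == 1, as in the pseudocode above */

	key = atomic_key(insn);
	return bsearch(&key, atomic_excluded,
		       sizeof(atomic_excluded) / sizeof(atomic_excluded[0]),
		       sizeof(atomic_excluded[0]), cmp_key) == NULL;
}
```

Non-atomic instructions only pay for the first mask compare. With a table
this small a linear scan or the switch may well be just as fast, so this
would need measuring.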