Re: [PATCH 0/5] arm64: Add workaround for Cortex-A77 erratum 1542418

From: Suzuki K Poulose
Date: Thu Nov 14 2019 - 20:10:33 EST

In reply to: Will Deacon: "Re: [PATCH 0/5] arm64: Add workaround for Cortex-A77 erratum 1542418"

Hi Will

On 11/14/2019 04:39 PM, Will Deacon wrote:

Hi Suzuki,

On Thu, Nov 14, 2019 at 02:59:13PM +0000, Suzuki K Poulose wrote:

This series adds a workaround for Arm erratum 1542418, which affects

Searching for that erratum number doesn't find me a description :(

I believe this was published in the Cortex-A77 SDEN v9.0. I will chase
it internally.

Cortex-A77 cores (r0p0 - r1p0). Affected cores may execute stale
instructions from the L0 macro-op cache violating the
prefetch-speculation-protection guaranteed by the architecture.
This happens when the branch predictor bases its prediction for a
branch at this address on stale history left behind by ASID or VMID
reuse.
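For illustration, a minimal sketch of how the affected range (r0p0 - r1p0) could be tested from a MIDR_EL1 value. The field layout and the Cortex-A77 part number (0xD0D) are architectural; the function and macro names are hypothetical and are not taken from the patch series.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 fields: implementer[31:24], variant[23:20], partnum[15:4], revision[3:0] */
#define MIDR_IMPLEMENTER(m) (((m) >> 24) & 0xffu)
#define MIDR_VARIANT(m)     (((m) >> 20) & 0xfu)   /* the "x" in rXpY */
#define MIDR_PARTNUM(m)     (((m) >> 4) & 0xfffu)
#define MIDR_REVISION(m)    ((m) & 0xfu)           /* the "y" in rXpY */

#define IMPLEMENTER_ARM 0x41u
#define PART_CORTEX_A77 0xd0du

/* Hypothetical helper: true for Cortex-A77 r0p0 up to and including r1p0. */
static bool a77_in_erratum_1542418_range(uint32_t midr)
{
    if (MIDR_IMPLEMENTER(midr) != IMPLEMENTER_ARM ||
        MIDR_PARTNUM(midr) != PART_CORTEX_A77)
        return false;

    unsigned int variant = MIDR_VARIANT(midr);
    unsigned int revision = MIDR_REVISION(midr);

    /* Any r0pY is at or below r1p0; within r1, only p0 is in range. */
    return variant == 0 || (variant == 1 && revision == 0);
}
```

In the kernel this kind of check would normally go through the existing cpu_errata matching helpers rather than open-coded bit manipulation.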

Two immediate questions:

1. Can we disable the L0 MOP cache?

Yes, but it hurts performance.

2. Can we invalidate the branch predictor? If Spectre-v2 taught us
anything it's that removing those instructions was a mistake!

The workaround suggested is actually invalidating the branch history
but in a costly way. I am unaware of any.

Moving on...

Have you reproduced this at top-level? If I recall the
prefetch-speculation-protection, it's designed to protect against the
case where you have a direct branch:

No, see below.

addr: B foo

and another CPU writes out a new function:

bar:
insn0
...
insnN

before doing any necessary maintenance and then patches the original
branch to:

addr: B bar

The idea is that a concurrently executing CPU could mispredict the original
branch to point at 'bar', fetch the instructions before they've been written
out and then confirm the prediction by looking at the newly written branch
instruction. Even without the prefetch-speculation-protection, that's
fairly difficult to achieve in practice: you'd need to be doing something
like reusing memory to hold the instructions so that the initial
misprediction occurs.
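The writer side of that sequence, as I read it, can be sketched in user-space C as below. This is only a model under stated assumptions: an atomic pointer stands in for the branch at 'addr', `code_buf` stands in for the memory holding 'bar', and the names `code_buf`, `branch_target` and `publish_new_function` are hypothetical. `__builtin___clear_cache` is the GCC/Clang builtin for the required cache maintenance.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the memory that will hold the new function 'bar'. */
static unsigned char code_buf[64];

/* Stand-in for the branch at 'addr': readers follow this pointer. */
static _Atomic(unsigned char *) branch_target;

/* Returns 0 on success, -1 if the instructions do not fit. */
static int publish_new_function(const unsigned char *insns, size_t len)
{
    if (len > sizeof(code_buf))
        return -1;

    /* 1. Write out the new function body (insn0 ... insnN). */
    memcpy(code_buf, insns, len);

    /* 2. Cache maintenance so instruction fetch sees the new bytes. */
    __builtin___clear_cache((char *)code_buf, (char *)code_buf + len);

    /* 3. Only now "patch the branch" so that it points at the new code. */
    atomic_store_explicit(&branch_target, code_buf, memory_order_release);
    return 0;
}
```

The point of the ordering is that the redirect in step 3 is the last thing to become visible; the prefetch-speculation-protection is what stops a reader that mispredicted to 'bar' early from committing bytes fetched before step 1.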

How does A77 stop this from occurring when the ASID is not reallocated (e.g.
the example above)? Is the MOP cache flushed somehow?

IIUC, the MOP cache is flushed on I-cache invalidate, so that case is fine.

With this erratum, it sounds like you have to end up reusing an ASID from
a task that had a branch at 'addr' in its address space that branched to
the address of 'bar' (again, in its address space). Is that right? That
sounds super rare to me, particularly with ASLR: not only does the aliasing

AFAICS, yes. On top of that, the CPU must also miss "addr" in the MOP
cache and hit "bar" there before the I-cache invalidate is received. This
may cause "bar" to be fetched from the MOP cache (the fetch is not
cancelled even though the I-cache invalidate triggers a MOP flush after
the hit), and "addr" must then miss in the I-cache, causing the updated
instruction to be fetched.

Also, this means that the new context must not have executed "addr"
(which would give a hit in the MOP cache) while "bar" was fetched. So
this adds further constraints on actually hitting the erratum.

branch need to exist, but it needs to be held in the branch predictor while
we cycle through 64k ASIDs *and* the race with the writer needs to happen
so that we get stale instructions from the MOP cache.
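To put the "cycle through 64k ASIDs" condition in concrete terms, here is a deliberately simplified model of a 16-bit ASID space with a generation counter. It is not the arm64 allocator (which tracks ASIDs in a bitmap and handles reserved ASIDs per CPU); the struct and function names are hypothetical, and reserving ASID 0 is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASID_BITS 16u
#define NUM_ASIDS (1u << ASID_BITS)   /* 65536, the "64k" above */

/* Simplified model: hand out ASIDs sequentially, ASID 0 is never used. */
struct asid_model {
    uint64_t generation;  /* bumped each time the ASID space wraps */
    uint32_t last;        /* most recently allocated ASID */
};

/* Returns the next ASID and reports whether the space just rolled over. */
static uint32_t asid_model_alloc(struct asid_model *m, bool *rolled_over)
{
    *rolled_over = false;
    if (m->last + 1 >= NUM_ASIDS) {
        /* Space exhausted: start a new generation and reuse ASIDs from 1. */
        m->generation++;
        m->last = 0;
        *rolled_over = true;
    }
    return ++m->last;
}
```

In this model an ASID value is handed out a second time only after 65535 other allocations, which is how long the stale branch history would have to survive in the predictor.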

Is there something I'm missing that makes this remotely plausible?

No :-)

Cheers
Suzuki
