Re: [PATCH 1/2] MIPS: Add barriers between dcache & icache flushes

From: Paul Burton
Date: Mon Feb 29 2016 - 21:24:12 EST

On Mon, Feb 22, 2016 at 06:39:30PM -0500, Joshua Kinard wrote:
> On 02/22/2016 13:09, Paul Burton wrote:
> > Index-based cache operations may be arbitrarily reordered by out of
> > order CPUs. Thus code which writes back the dcache & then invalidates
> > the icache using indexed cache ops must include a barrier between
> > operating on the 2 caches in order to prevent the scenario in which:
> >
> > - icache invalidation occurs.
> >
> > - icache fetch occurs, due to speculation.
> >
> > - dcache writeback occurs.
> >
> > If the above were allowed to happen then the icache would contain stale
> > data. Forcing the dcache writeback to complete before the icache
> > invalidation avoids this.
> Is there a particular symptom one should look for to check for this issue
> occurring? I haven't seen any odd effects on my SGI systems that appear to
> relate to this. I believe the R1x000 family resolves all hazards in hardware,
> so maybe this issue doesn't affect that CPU family?
> If not, let me know what to look or test for so I can check the patch out on my
> systems.
> Thanks!
> --J

Hi Joshua,

It depends upon the implementation of the CPU, but the arch spec (MIPS64
BIS, MD00087, revision 6.02) does say:

> When implementing multiple level of caches and where the hardware maintains
> the smaller cache as a proper subset of a larger cache (every address which is
> resident in the smaller cache is also resident in the larger cache; also known
> as the inclusion property). It is recommended that the CACHE instructions
> which operate on the larger, outer-level cache; must first operate on the
> smaller, inner-level cache. For example, a Hit_Writeback_Invalidate operation
> targeting the Secondary cache, must first operate on the primary data
> cache first. If the CACHE instruction implementation does not follow
> this policy then any software which flushes the caches must mimic this
> behavior. That is, the software sequences must first operate on the
> inner cache then operate on the outer cache. The software must place a
> SYNC instruction after the CACHE instruction whenever there are
> possible writebacks from the inner cache to ensure that the writeback
> data is resident in the outer cache before operating on the outer
> cache. If neither the CACHE instruction implementation nor the
> software cache flush sequence follow this policy, then the inclusion
> property of the caches can be broken, which might be a condition that
> the cache management hardware cannot properly deal with.
> When implementing multiple level of caches without the inclusion
> property, the use of a SYNC instruction after the CACHE instruction is
> still needed whenever writeback data has to be resident in the next
> level of memory hierarchy.

If data is to travel from dcache -> L2 -> icache then it first has to be
written back to the L2, which is exactly the case of writeback data
needing "to be resident in the next level of memory hierarchy" beyond
the dcache. A SYNC after the dcache writeback guarantees that, since the
spec also says:

> The CACHE instruction and the memory transactions which are sourced by
> the CACHE instruction, such as cache refill or cache writeback, obey
> the ordering and completion rules of the SYNC instruction.

To the best of my knowledge this is something that newer cores, which
reorder more aggressively, would be more likely to hit.
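
To make the ordering concrete, here is a rough sketch of the sequence
being discussed. It is not the code from the patch: the function names
are made up for illustration, the two cache loops are replaced by
stand-ins that only record that they ran, and the barrier uses the
GCC/Clang __sync_synchronize() builtin, which I'd expect to become a
SYNC on MIPS but which is an assumption here rather than what the kernel
does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Barrier placed between the two cache loops. Assumption: the compiler
 * builtin stands in for a MIPS SYNC so that the sketch builds anywhere.
 */
#define cache_op_barrier() __sync_synchronize()

enum step { STEP_DCACHE_WB, STEP_BARRIER, STEP_ICACHE_INV };

#define MAX_STEPS 8
static enum step steps[MAX_STEPS];
static size_t nr_steps;

/* Append one step to the trace so the program order can be inspected. */
static void record(enum step s)
{
	assert(nr_steps < MAX_STEPS);
	steps[nr_steps++] = s;
}

/* Hypothetical stand-ins for the indexed dcache & icache loops. */
static void dcache_writeback_all(void)
{
	record(STEP_DCACHE_WB);
}

static void icache_invalidate_all(void)
{
	record(STEP_ICACHE_INV);
}

/*
 * Write back the dcache, wait for the writebacks to complete, and only
 * then invalidate the icache.
 */
static void flush_icache_all_sketch(void)
{
	dcache_writeback_all();
	cache_op_barrier();
	record(STEP_BARRIER);
	icache_invalidate_all();
}
```

Calling flush_icache_all_sketch() once leaves three entries in steps[]:
writeback, barrier, invalidate. The trace only shows program order; on
real hardware it is the barrier in the middle that stops an out of order
core from performing the icache invalidation before the dcache writeback
has completed.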