Re: [PATCH] x86: Add an explicit barrier() to clflushopt()

From: Linus Torvalds
Date: Tue Jan 12 2016 - 23:39:53 EST


On Tue, Jan 12, 2016 at 6:42 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> Since barriers are on my mind: how strong a barrier is needed to
> prevent cache fills from being speculated across the barrier?

I don't think there are *any* architectural guarantees.

I suspect that a real serializing instruction should do it. But I
don't think even that is guaranteed.

Non-coherent IO is crazy. I really thought Intel had learnt their
lesson, and finally made all the GPUs coherent. I'm afraid to even
ask why Chris is actually working on some sh*t that requires clflush.

In general, you should probably do something nasty like

 - flush before starting IO that generates data (to make sure you have
   no dirty cachelines that will write back and mess up)

 - start the IO, wait for it to complete

 - flush after finishing IO that generates the data (to make sure you
   have no speculative clean cachelines with stale data)

 - read the data now.
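
A rough user-space sketch of the four steps above, added for illustration and not taken from the original mail. Everything named here is an assumption: flush_range(), read_from_noncoherent_device() and the fake_dma_* stand-ins are hypothetical, the 64-byte line size is assumed rather than queried, and the mfence around the clflush loop is one conservative choice, not a statement of what the architecture guarantees. Real kernel code would go through the DMA API rather than open-coding this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#ifdef __SSE2__
#include <emmintrin.h>          /* _mm_clflush(), _mm_mfence() */
#endif

#define CACHELINE 64            /* assumed line size; real code should query CPUID */

/* Flush every cache line overlapping [addr, addr + len).
 * On builds without SSE2 this compiles to a no-op placeholder. */
static void flush_range(const void *addr, size_t len)
{
    if (len == 0)
        return;
#ifdef __SSE2__
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHELINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    _mm_mfence();               /* assumption: fence before the flushes */
    for (; p < end; p += CACHELINE)
        _mm_clflush((const void *)p);
    _mm_mfence();               /* assumption: fence after the flushes */
#else
    (void)addr;
#endif
}

/* Hypothetical device hooks: start a transfer into dst, wait for it. */
typedef void (*dma_start_fn)(void *dst, size_t len);
typedef void (*dma_wait_fn)(void);

/* Stand-in "device" so the sketch runs anywhere. It writes through the
 * CPU cache, so it only simulates the control flow, not real
 * non-coherent DMA. */
static void fake_dma_start(void *dst, size_t len) { memset(dst, 0xAB, len); }
static void fake_dma_wait(void) { }

static void read_from_noncoherent_device(void *buf, size_t len, void *out,
                                         dma_start_fn start, dma_wait_fn wait)
{
    assert(buf != NULL && out != NULL);

    flush_range(buf, len);      /* 1: no dirty lines left to write back later */
    start(buf, len);            /* 2: start the IO ...                        */
    wait();                     /*    ... and wait for it to complete         */
    flush_range(buf, len);      /* 3: drop speculatively filled clean lines   */
    memcpy(out, buf, len);      /* 4: only now read the data                  */
}
```

The second flush is the step that is easy to forget: nothing was written by the CPU in between, but the CPU may have speculatively pulled stale lines back into the cache while the IO was in flight.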

Of course, what people actually end up doing to avoid all this is to
mark the memory noncacheable.

And finally, the *correct* thing is to not have crap hardware, and
have IO be cache coherent. Things that don't do that are shit. Really.

Linus