Re: [RFT/PATCH v2 2/6] x86-64: Optimize vread_tsc's barriers

From: Ingo Molnar
Date: Sat Apr 09 2011 - 07:51:33 EST



* Andrew Lutomirski <luto@xxxxxxx> wrote:

> > * Modulo errata, BIOS bugs, implementation bugs, etc.
>
> As far as I can tell, on Sandy Bridge and Bloomfield, I can't get the
> sequence lfence;rdtsc to violate the rule above. That's the case even if I
> stick random arithmetic and branches right before the lfence. If I remove
> the lfence, though, it starts to fail. (This is without the evil fake
> barrier.)

It's not really evil, just too tricky and hence very vulnerable to entropy ;-)

> However, as expected, I can see stores getting reordered after lfence;rdtsc
> and rdtscp but not mfence;rdtsc.

Is this lfence;rdtsc variant enough for your real use case as well?
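
For reference, here is a minimal userspace sketch of the two fenced variants
discussed above. It assumes GCC or Clang on x86-64; the helper names
tsc_lfence() and tsc_mfence() are made up for illustration, and this is not
the vread_tsc code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: lfence;rdtsc. The lfence keeps rdtsc from executing
 * before earlier loads have completed. Per the observations quoted above,
 * earlier stores can still become visible after the TSC read.
 */
static inline uint64_t tsc_lfence(void)
{
	uint32_t lo, hi;

	/* rdtsc returns the low 32 bits in EAX and the high 32 bits in EDX */
	__asm__ __volatile__("lfence\n\trdtsc"
			     : "=a" (lo), "=d" (hi)
			     :
			     : "memory");
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical helper: mfence;rdtsc. The stronger (and slower) variant,
 * which in the quoted test also kept stores from moving past the read.
 */
static inline uint64_t tsc_mfence(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("mfence\n\trdtsc"
			     : "=a" (lo), "=d" (hi)
			     :
			     : "memory");
	return ((uint64_t)hi << 32) | lo;
}
```

Two back-to-back calls on the same CPU, e.g. a = tsc_lfence(); b = tsc_lfence();
would be expected to give b >= a, assuming the TSC is sane on that machine.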

Basically, we are free to define whatever sensible semantics we find reasonable
and fast. The whole TSC picture was such a mess for a decade or so that apps
did not make assumptions (because we could not make guarantees).

> So... do you think that the rule is sensible?

The barrier properties of this system call are flexible in the same sense, so
your proposal looks sensible to me. I'd go for the weakest barrier that still
works fine: it is the fastest one, and it also gives us the most options for
the future.

> I'll post the test case somewhere when it's a little less ugly. I'd like to
> see test results on AMD.

That would be nice - we could test it on various Intel and AMD CPUs.

Thanks,

Ingo