Re: [PATCH] x86:pvclock: add missing barriers
From: Roman Kagan
Date: Wed Jun 08 2016 - 17:01:40 EST
On Wed, Jun 08, 2016 at 09:45:09PM +0200, Borislav Petkov wrote:
> On Wed, Jun 08, 2016 at 09:11:39PM +0300, Roman Kagan wrote:
> > Gradual removal of excessive barriers in pvclock reading functions
> > (commits 502dfeff239e8313bfbe906ca0a1a6827ac8481b,
> > a3eb97bd80134ba07864ca00747466c02118aca1) ended up removing too much:
> > although rdtsc is now orderd WRT other loads, there's no protection
> > against the compiler reordering the loads of ->version with the loads of
> > other fields.
> >
> > E.g. on my system gcc-5.3.1 generates code which loads ->system_time and
> > ->flags outside of the ->version test loop.
> >
> > (Re)introduce the compiler barriers around accesses to the contents of
> > pvclock. While at this, make the function a bit more compact by
> > removing unnecessary local variables.
> >
> > Signed-off-by: Roman Kagan <rkagan@xxxxxxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> > Cc: x86@xxxxxxxxxx
> > Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxx>
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > ---
> > arch/x86/include/asm/pvclock.h | 17 +++++------------
> > 1 file changed, 5 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/pvclock.h b/arch/x86/include/asm/pvclock.h
> > index fdcc040..65c4de2 100644
> > --- a/arch/x86/include/asm/pvclock.h
> > +++ b/arch/x86/include/asm/pvclock.h
> > @@ -80,18 +80,11 @@ static __always_inline
> > unsigned __pvclock_read_cycles(const struct pvclock_vcpu_time_info *src,
> > cycle_t *cycles, u8 *flags)
> > {
> > - unsigned version;
> > - cycle_t ret, offset;
> > - u8 ret_flags;
> > -
> > - version = src->version;
> > -
> > - offset = pvclock_get_nsec_offset(src);
> > - ret = src->system_time + offset;
> > - ret_flags = src->flags;
> > -
> > - *cycles = ret;
> > - *flags = ret_flags;
> > + unsigned version = src->version;
> > + barrier();
> > + *cycles = src->system_time + pvclock_get_nsec_offset(src);
> > + *flags = src->flags;
> > + barrier();
> > return version;
>
> I have a similar patchset in my mbox starting here:
>
> https://lkml.kernel.org/r/1464329832-4638-1-git-send-email-mnghuan@xxxxxxxxx
>
> Care to take a look?
Just did, thanks for the link.
The difference is whether you want the reader to see consistent view of
the pvclock data (as in my patch) or also the most up to date one
(as in Minfei Huang's patch) at the cost of extra lfence instructions
(on my machine this is 30% slowdown).
I'm not sure if the latter is really necessary. If it is, then the
lfence or mfence in rdtsc_ordered() becomes excessive. Perhaps we'd
have to revert to rdtsc_barrier() to surround pvclock data access, and
plain rdtsc() instead of rdtsc_ordered().
Roman.