Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread
From: Andi Kleen
Date: Tue Jun 19 2018 - 09:43:52 EST
> In particular 1) means that any extra instructions executed/not executed
> will cause a replay divergence (in practice rr uses retired conditional
> branches rather than instructions, because the instruction counter is
> not accurate, while the branch one is). This alone causes a problem
> for the present case, because glibc branches on the xcr0 value before
> it branches on the cpuid value for AVX512. Glibc does check for the
> correct cpuid before calling xgetbv, so one possible thing to do is to
> completely disable xsave during recording by disabling it in CPUID, but
> that would make rr quite a bit less useful, since it wouldn't be able to
Ah I see it now. This problem was introduced with the changes
for glibc to save AVX registers using XSAVE instead of manually.
It still seems this has a straightforward fix in glibc though.
It could always allocate the worst case buffer, and also
verify XGETBV against CPUID first. I'm sure this can be
done in a way that executed branches don't differ.
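
A minimal sketch of what such a check could look like, assuming the
XCR0 value (from XGETBV with ECX=0) and the supported-feature mask
(from CPUID leaf 0xD, sub-leaf 0, EDX:EAX) have already been read into
plain integers. The helper names usable_xstate, all_set_mask and
save_variant are hypothetical and are not glibc's code. Note that C
cannot guarantee what the compiler emits; the expressions below are
only written so that no conditional branch is needed.

```c
#include <stdint.h>

/* Architectural XCR0 state-component bits. */
#define XCR0_SSE     (1ULL << 1)
#define XCR0_AVX     (1ULL << 2)
#define XCR0_AVX512  (7ULL << 5)   /* opmask, ZMM_Hi256, Hi16_ZMM */

/*
 * Hypothetical helper: keep only the XCR0 bits that CPUID also
 * advertises. A plain AND, so nothing branches on the raw XCR0 value.
 */
static inline uint64_t usable_xstate(uint64_t xcr0, uint64_t cpuid_mask)
{
    return xcr0 & cpuid_mask;
}

/*
 * Hypothetical helper: all-ones if every bit of `want` is set in
 * `have`, zero otherwise, computed with arithmetic only.
 */
static inline uint64_t all_set_mask(uint64_t have, uint64_t want)
{
    uint64_t missing = want & ~have;                    /* != 0 if a bit is missing */
    uint64_t any = (missing | ((uint64_t)0 - missing)) >> 63; /* 1 if missing != 0 */
    return any - 1;                                     /* 0 -> all-ones, 1 -> 0 */
}

/*
 * Hypothetical selector: 0 = SSE only, 1 = AVX, 2 = AVX-512.
 * The result is an index, so the caller can pick a save routine
 * from a table instead of testing XCR0 bits one by one.
 */
static inline unsigned save_variant(uint64_t xcr0, uint64_t cpuid_mask)
{
    uint64_t usable = usable_xstate(xcr0, cpuid_mask);
    uint64_t avx    = all_set_mask(usable, XCR0_SSE | XCR0_AVX);
    uint64_t avx512 = all_set_mask(usable, XCR0_SSE | XCR0_AVX | XCR0_AVX512);
    return (unsigned)((avx & 1) + (avx512 & 1));
}
```

With this shape, save_variant(0xE7, 0x07) and save_variant(0x07, 0x07)
both yield 1: a machine whose XCR0 enables AVX-512 state while CPUID
hides it selects the same variant as a plain AVX machine. The buffer
itself would be sized from the worst case reported by CPUID (leaf 0xD,
sub-leaf 0, ECX) rather than from the XCR0-dependent size in EBX.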
AFAIK manual use of XSAVE is not that common, so hopefully
these problems are not widespread in other programs.
Of course longer term you'll just need to have matching
ISAs in record and replay. Trying to patch around this
is likely always difficult.
-Andi