Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread

From: Keno Fischer
Date: Mon Jun 18 2018 - 13:51:16 EST


> So we're talking about a workaround for broken software. The question
> is how wide spread is it?

To work, rr has to replicate the process state *exactly*. That means:

1. The same instructions executed in the same order
2. The exact same register state at those instructions
3. The same memory state, bit-by-bit

In particular, 1) means that any instruction that is executed during
recording but not during replay (or vice versa) causes a replay
divergence. (In practice rr counts retired conditional branches rather
than instructions, because the instruction counter is not accurate,
while the branch counter is.) This alone is a problem for the present
case, because glibc branches on the xcr0 value before it branches on
the cpuid value for AVX512. Glibc does check the correct cpuid bit
before calling xgetbv, so one option is to disable xsave completely
during recording by disabling it in CPUID. That would make rr quite a
bit less useful, though, since it could no longer record any bugs that
require AVX to be used.

The xsave problem is worse, however, because xcr0 determines how much
memory `xsave` writes. If we emulate cpuid to pretend that AVX512 does
not exist, and the user space application uses that emulated cpuid to
determine the size of the required buffer, `xsave` now overflows that
buffer (unless the application also uses cpuid to compute a minimal
RFBM for xsave, which no application seems to do).
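
To make the buffer-size problem concrete, here is a minimal sketch of
how the extent of an xsave write (standard format) follows from XCR0
and RFBM. The component offsets and sizes in `layout` are an assumption
(values that Intel CPUs commonly report); real code has to read them
from CPUID leaf 0xD. `xsave_bytes` and `layout` are names made up for
this example.

```c
#include <stdint.h>

/* Standard-format XSAVE area: 512-byte legacy region + 64-byte header. */
#define XSAVE_LEGACY_AND_HEADER 576u

struct xcomp {
    uint32_t offset; /* byte offset of the component in the XSAVE area */
    uint32_t size;   /* byte size of the component */
};

/*
 * ASSUMPTION: illustrative offsets/sizes as commonly reported by Intel
 * CPUs. Real code must query CPUID leaf 0xD, sub-leaf i (EAX = size,
 * EBX = offset). Components not listed here are treated as empty.
 */
static const struct xcomp layout[8] = {
    [2] = { 576, 256 },   /* AVX: upper halves of YMM0-15 */
    [5] = { 1088, 64 },   /* AVX-512 opmask registers k0-k7 */
    [6] = { 1152, 512 },  /* AVX-512 ZMM_Hi256 */
    [7] = { 1664, 1024 }, /* AVX-512 Hi16_ZMM */
};

/*
 * Upper bound on the bytes XSAVE touches: the hardware saves the
 * components selected by (XCR0 & RFBM), regardless of what cpuid
 * was made to report to the application.
 */
static uint32_t xsave_bytes(uint64_t xcr0, uint64_t rfbm)
{
    uint64_t mask = xcr0 & rfbm;
    uint32_t end = XSAVE_LEGACY_AND_HEADER;

    for (int i = 2; i < 8; i++) {
        if ((mask >> i) & 1) {
            uint32_t comp_end = layout[i].offset + layout[i].size;
            if (comp_end > end)
                end = comp_end;
        }
    }
    return end;
}
```

With these numbers, xsave_bytes(0x7, ~0ull) is 832, which is what an
application would allocate if cpuid hides AVX512, while
xsave_bytes(0xe7, ~0ull) is 2688, which is what gets written if xcr0
still has the AVX512 bits set. Passing RFBM = 0x7 brings the write back
down to 832.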

> Do memory contents which are never read by the application matter?

In theory, no. In practice, however, most memory divergences I've seen
(especially ones on the stack) end up causing control flow divergences
down the line, because some code somewhere picks up the uninitialized
memory and branches on it.
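
As a toy illustration of that last point (not code from rr or from any
real application; `REGION`, `leave_residue` and `branches_until_zero`
are hypothetical names), consider a region of memory that an earlier
save filled to a different extent during recording and replay, and
that is later read without being initialized:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stretch of stack that gets reused. */
#define REGION 64

/*
 * Leaves the region as an earlier save of `written` bytes would:
 * the first `written` bytes hold data, the rest stay zero.
 * `written` must not exceed REGION.
 */
static void leave_residue(uint8_t *region, size_t written)
{
    memset(region, 0, REGION);
    memset(region, 0xAA, written);
}

/*
 * Models later code that reads the region without initializing it.
 * Returns how many loop iterations ran, i.e. how many times the
 * loop's conditional branch went around before hitting a zero byte.
 */
static unsigned branches_until_zero(const uint8_t *buf, size_t len)
{
    unsigned taken = 0;

    for (size_t i = 0; i < len && buf[i] != 0; i++)
        taken++;
    return taken;
}
```

If the recording left 48 bytes behind and the replay only 16, the loop
runs 48 times in one case and 16 times in the other, so the retired
branch counts no longer match even though the program never
"officially" used that memory.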

Hope that helps,
Keno