Re: [RFC PATCH v2] x86/arch_prctl: Add ARCH_SET_XCR0 to set XCR0 per-thread
From: Andy Lutomirski
Date: Tue Apr 14 2020 - 19:23:36 EST
On Tue, Apr 7, 2020 at 11:30 AM Keno Fischer <keno@xxxxxxxxxxxxxxxxxx> wrote:
>
> > TSX!
>
> Yes, it's problematic, but luckily it turns out to
> be OK in practice if masked off in CPUID.
>
> > I think rr should give the raw KVM API at least a try. It should be possible to fire up a vCPU in CPL3 in the correct state. No guest kernel required. I don't know if there will be issues with the perf API, though.
>
> Yes, I've looked into it, but stopped short of doing a
> complete implementation. Using KVM to solve it
> for replay would probably be feasible with a moderate
> amount of engineering work, since rr does very few
> syscalls during replay. I'm a bit afraid of the
> performance implications, but I don't have numbers on this.
>
> Record and diversions are a lot harder though, because
> in this mode the tracee is a live process and able to do
> syscalls (and needs to receive signals and all that good
> stuff associated with being a real process). For diversions,
> performance isn't super important, so we could probably
> emulate this, but for record, performance is quite critical.
> I assume it would be possible to add a feature to KVM
> where it forwards syscalls made in guest CPL3 to the real
> kernel without round-trip through userspace, but I'm just
> seeing myself back here asking
> for a weird KVM feature that nobody but me wants ;)
> (well almost nobody, as I mentioned, there's an
> academic project that tried this with a custom kernel
> plugin - http://dune.scs.stanford.edu/).
>
> Admittedly, the use case for this feature during record is
> less pressing, since in our (operational) case
> the replay machines tend to be much newer than
> the record machines, but I wouldn't be surprised if I got
> bit by this as soon as the next user xstate component gets
> added and users start sending me those kinds of traces,
> even if we mask off the feature in CPUID (which rr already
> supports for record for similar reasons).
I'm imagining that rr would record the usual way with normal XCR0
(why would you want to record with an unusual XCR0?) and replay would
use KVM. I'm not sure about diversions. This way KVM wouldn't need
to deal with syscalls.
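
For illustration only, here is a minimal sketch of the check a replay tool
might run before asking KVM for a vCPU with the recorded XCR0. The function
name xcr0_replayable and the XCR0_* macro names are hypothetical and do not
come from rr or KVM; the bit positions and consistency rules are the ones the
Intel SDM gives for XSETBV. It assumes the recorded XCR0 and the replay
host's XCR0 are already known as plain 64-bit masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits (Intel SDM). Macro names are illustrative. */
#define XCR0_X87       (1ULL << 0)
#define XCR0_SSE       (1ULL << 1)
#define XCR0_AVX       (1ULL << 2)
#define XCR0_BNDREGS   (1ULL << 3)
#define XCR0_BNDCSR    (1ULL << 4)
#define XCR0_OPMASK    (1ULL << 5)
#define XCR0_ZMM_HI256 (1ULL << 6)
#define XCR0_HI16_ZMM  (1ULL << 7)

#define XCR0_MPX    (XCR0_BNDREGS | XCR0_BNDCSR)
#define XCR0_AVX512 (XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/*
 * Hypothetical helper: can a trace recorded with XCR0 == recorded be
 * replayed on a host whose XCR0 is host?
 */
static bool xcr0_replayable(uint64_t recorded, uint64_t host)
{
    /* The replay host must support every component used at record time. */
    if (recorded & ~host)
        return false;

    /* x87 state (bit 0) must always be enabled. */
    if (!(recorded & XCR0_X87))
        return false;

    /* AVX state requires SSE state. */
    if ((recorded & XCR0_AVX) && !(recorded & XCR0_SSE))
        return false;

    /* The two MPX components must be enabled or disabled together. */
    uint64_t mpx = recorded & XCR0_MPX;
    if (mpx && mpx != XCR0_MPX)
        return false;

    /* The three AVX-512 components go together and require AVX. */
    uint64_t avx512 = recorded & XCR0_AVX512;
    if (avx512 && (avx512 != XCR0_AVX512 || !(recorded & XCR0_AVX)))
        return false;

    return true;
}
```

With this, a trace recorded with XCR0 = 0x7 (x87, SSE, AVX) passes on an
AVX-512 host with XCR0 = 0xe7, while the reverse combination is rejected.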
Would this work?