On VMX, I think the tracepoint can be moved below the VMWRITEs without much
contention (though doing so is likely a nop), but moving it below
kvm_load_guest_xsave_state() requires a bit more discussion.

I 100% agree that the current behavior can be a bit confusing, but I wonder if
we'd be better off "solving" that problem through documentation.