Re: [PATCH 1/2] x86/arch_prctl: add ARCH_SET_{COMPAT,NATIVE} to change compatible mode

From: Andy Lutomirski
Date: Thu Apr 14 2016 - 14:28:02 EST


On Wed, Apr 13, 2016 at 9:55 AM, Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> wrote:
> On 04/08/2016 11:44 PM, Andy Lutomirski wrote:
>>
>> Feel free to ask for help on some of these details. user_64bit_mode
>> will be helpful too.
>
> Hello again,
>
> here are some questions on TIF_IA32 removal:
> - in function intel_pmu_pebs_fixup_ip: there is a need to
> know whether the process was in native or compat mode, for the
> instruction interpreter doing the IP + one instruction fixup. There are
> registers, but they are from PEBS, which does not contain
> segment descriptors (even for PEBSv3). Other values
> are from interrupt regs (look at setup_pebs_sample_data).
> So, I guess, we may use user_64bit_mode on interrupt
> register set, which will be racy with changing task's mode,
> but quite ok?

Here's my understanding:

We don't actually know the mode, and there's no way we could get it
exactly. User code could have changed the mode between when the PEBS
event was written and when we got the interrupt, and there's no way
for us to tell.

The regs passed to the interrupt aren't particularly helpful -- if we
get the overflow event from kernel mode, the regs will be kernel regs,
not user regs.

What we can do is use the regs returned by perf_get_regs_user,
which I imagine perf is already doing. Peter, is this the case?

If necessary, starting in 4.6, I could make the regs->cs part of
perf_get_regs_user be correct no matter what. The only funny case
left is an NMI in the system call prologue. (There can't be
intervening interrupts any more other than MCE, and I don't think we
really care whether we report correct PEBS results if we take a
machine check in the middle.)
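
To make the idea concrete, here is a minimal userspace sketch (not
kernel code) of choosing the instruction decoder's mode from the CS
selector in the user regs. The sketch_* names and the cut-down regs
struct are hypothetical stand-ins, the selector values are what I
believe x86-64 Linux uses for __USER_CS and __USER32_CS, and the check
ignores the extra paravirt selector that the real user_64bit_mode()
also accepts:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed selector values for x86-64 Linux user code segments. */
#define SKETCH_USER_CS   0x33UL	/* stands in for __USER_CS (64-bit) */
#define SKETCH_USER32_CS 0x23UL	/* stands in for __USER32_CS (compat) */

/* Hypothetical stand-in for the fields of struct pt_regs used here. */
struct sketch_regs {
	unsigned long ip;
	unsigned long cs;
};

/* Simplified analogue of user_64bit_mode(): compares CS only. */
static bool sketch_user_64bit_mode(const struct sketch_regs *regs)
{
	assert(regs != NULL);
	return regs->cs == SKETCH_USER_CS;
}

/*
 * Best-effort decoder width for a sampled user IP. This is a guess:
 * the task may have switched modes between the sample being written
 * and the regs being captured.
 */
static int sketch_decode_bits(const struct sketch_regs *user_regs)
{
	return sketch_user_64bit_mode(user_regs) ? 64 : 32;
}
```

The point is that the answer comes from the user regs rather than from
a per-task flag, so it is only as accurate as those regs are.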

> - the same with LBR branching: I may get the cs value for
> user_64bit_mode or the full register set from intel_pmu_handle_irq
> and pass it through intel_pmu_lbr_read => intel_pmu_lbr_filter
> to branch_type for the instruction decoder, which may
> misinterpret an opcode for the same racy-mode-switching app.
> Is it also fine?

Same thing, I think.

> - for coredumping/ptracing, I will change test_thread_flag(TIF_IA32)
> by user_64bit_mode(task_pt_regs()) - that looks/should be simple.
> It's also valid as at the moment of coredump or of
> PTRACE_GETREGSET task isn't running.

Please cc Oleg Nesterov on that one.

> - I do not know what to do with uprobes - as you noted,
> the way it checks ia32_compat is buggy even now: a task that
> switches CS to __USER32_CS or back to __USER_CS will have
> a lousy uprobe inserted in its mm.

I have no idea, but I'll look at your patch and maybe have an idea.
Oleg Nesterov might be a good person to ask about that, too.

> So, how do we know at insert time which descriptor
> the program will be using when it runs the uprobed code?
> - for MPX, I guess, tracking which syscall called
> mpx_enable_management will work, at least it may be
> documented, that before switching, one needs to disable mpx.

You already have to disable MPX before switching because of hardware
issues, so I wouldn't worry about it.

> - perf_reg_abi everywhere is used with current, so it's
> also simple-switching to user_64bit_mode(task_pt_regs(current)).

perf_reg_abi should be better -- see the fancy code in
perf_get_regs_user, which is where it comes from these days, I think.
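
Roughly, the ABI ends up being derived from the captured user regs
instead of from a thread flag. Here is a userspace sketch of that
shape, under assumptions: the enum mirrors what I believe the
PERF_SAMPLE_REGS_ABI_* values are, the struct and function names are
made up for illustration, and 0x33 is assumed to be __USER_CS:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror PERF_SAMPLE_REGS_ABI_{NONE,32,64}. */
enum sketch_regs_abi {
	SKETCH_REGS_ABI_NONE = 0,
	SKETCH_REGS_ABI_32   = 1,
	SKETCH_REGS_ABI_64   = 2,
};

/* Assumed value of __USER_CS on x86-64 Linux. */
#define SKETCH_ABI_USER_CS 0x33UL

/* Hypothetical stand-in for the captured user register set. */
struct sketch_user_regs {
	unsigned long cs;
};

/*
 * Pick the sample ABI from the user regs. A NULL pointer models the
 * "no user regs available" case (for example, a kernel thread).
 */
static enum sketch_regs_abi
sketch_abi_from_regs(const struct sketch_user_regs *user_regs)
{
	if (user_regs == NULL)
		return SKETCH_REGS_ABI_NONE;

	return user_regs->cs == SKETCH_ABI_USER_CS ?
		SKETCH_REGS_ABI_64 : SKETCH_REGS_ABI_32;
}
```

With that shape, a task that flips CS gets whichever ABI matches the
regs at capture time, and no TIF_IA32 lookup is involved.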

>
> In conclusion:
> I will send those patches, but I do not know what to do with
> uprobes tracing. Could you give some advice on what to do with
> that?
> It seems like, if I do those things, I will only need a way to
> change vdso blob, without swapping some compatible flags,
> as 64-bit tasks will differ from 32-bit only by the way they
> execute syscalls.

Fantastic!

--Andy


--
Andy Lutomirski
AMA Capital Management, LLC