RE: [PATCH v9 00/17] Enable FSGSBASE instructions

From: Metzger, Markus T
Date: Tue Dec 10 2019 - 03:27:42 EST


> > > The general kernel rule is that we don't break working applications.
> > > Other than that, we're allowed to change the ABI if existing working
> > > applications don't break. I can't tell whether you wrote a test that
> > > detects a behavior change or whether you wrote a test that tests
> > > behavior that gdb or other programs actually rely on.
> >
> > Well, that's a tough question. The test covers GDB's behavior on today's
> > systems. GDB itself does not actually rely on that behavior. That is, GDB
> > itself wouldn't break. You couldn't do all that you could do with it before,
> > though.
>
> GDB does rely on at least some behavior. If I tell gdb to call a
> function on my behalf, doesn't it save the old state, call the
> function, and then restore the state? Surely it expects the restore
> operation to actually restore the state.

It does. If we managed to break that, inferior calls in GDB would be
broken. Users who don't use inferior calls wouldn't know or care,
though. That's the point I was trying to make previously.


> It also helps that very, very few 64-bit applications use nonzero
> segments at all. They used to because of a kernel optimization to
> automatically load a segment if an FS or GSBASE less than 4GB was
> requested, but that's been gone for a while. Calling
> set_thread_area() at all in a 64-bit program requires considerable
> gymnastics, and distributions can and do disable modify_ldt() outright
> without significant ill effects.
>
> So we're mostly talking about compatibility with 32-bit programs and
> exotic users like Wine and DOSEMU.

I agree that this should mostly affect 32-bit programs.


> > > Certainly, with a 32-bit *gdb*, writing a nonzero value to FS or GS
> > > using ptrace should change the base accordingly. I think the current
> > > patches get this wrong.
> > >
> > > With a 64-bit gdb and a 32-bit inferior, in an ideal world, everything
> > > would work just like full 64-bit, since that's how the hardware works.
> >
> > Not sure what you mean. The h/w runs in compatibility mode and the
> > inferior cannot set the base directly, can it?
>
> I think there's a general impedance mismatch between gdb and the
> kernel/hw here. On Linux on a 64-bit machine, there isn't really a
> strong concept of a "32-bit process" versus a "64-bit process". All
> tasks have 64-bit values in RAX, all tasks have R8-R15, all tasks have
> a GDT and an LDT, etc. "32-bit tasks" are merely tasks that happen to
> be running with a compatibility selector loaded into CS at the time.
> Tasks can and do switch freely between compatibility and long mode
> using LJMP or LRET. As far as I can tell, however, gdb doesn't really
> understand this and thinks that 32-bit tasks are their own special
> thing.
>
> This causes me real problems: gdb explodes horribly if I connect gdb
> to QEMU's gdbserver (qemu -s) and try to debug during boot when the
> inferior switches between 32-bit and long mode.
>
> As far as FSGSBASE goes, a "32-bit task" absolutely can set
> independent values in FS and FSBASE, although it's awkward to do so:
> the task would have to do a far transfer to long mode, then WRFSBASE,
> then far transfer back to compat mode. But this entire sequence of
> events could occur without entering the kernel at all, and the ptrace
> API should be able to represent the result. I think that, ideally, a
> 64-bit debugger would understand the essential 64-bitness of even
> compat tasks and work sensibly. I don't really expect gdb to be able
> to do this any time soon, though.

I guess the primary use case would be an application that was originally
written for 32-bit and has been maintained ever since. GDB is probably
64-bit in that case.
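
To make the "compat tasks are just tasks with a compatibility selector
in CS" point concrete, here is a minimal sketch of how the mode follows
from the code segment descriptor alone. It assumes the CPU is in IA-32e
mode and that the 64-bit value passed in is a raw code segment
descriptor; the function and enum names are made up for illustration
and come from neither GDB nor the kernel.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum cs_mode {
    CS_COMPAT_16,  /* L=0, D=0: compatibility mode, 16-bit default size */
    CS_COMPAT_32,  /* L=0, D=1: compatibility mode, 32-bit default size */
    CS_LONG_64,    /* L=1, D=0: 64-bit mode */
    CS_RESERVED,   /* L=1, D=1: reserved combination */
};

/* Bit positions within the 8-byte code segment descriptor. */
#define DESC_L_BIT  (UINT64_C(1) << 53)  /* L: 64-bit code segment */
#define DESC_DB_BIT (UINT64_C(1) << 54)  /* D/B: default operand size */

/*
 * Classify the execution mode selected by a code segment descriptor.
 * Assumes IA-32e mode is active and desc really describes a code segment.
 */
static enum cs_mode cs_mode_from_descriptor(uint64_t desc)
{
    int l = (desc & DESC_L_BIT) != 0;
    int d = (desc & DESC_DB_BIT) != 0;

    if (l)
        return d ? CS_RESERVED : CS_LONG_64;
    return d ? CS_COMPAT_32 : CS_COMPAT_16;
}
```

Nothing in this check is per-task state: a far transfer that loads a
different CS changes the answer, which is why a debugger that fixes the
"bitness" once at attach time can end up out of sync with the inferior.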


> > We had discussed this some time ago and proposed the following behavior: "
> > https://lore.kernel.org/lkml/1521481767-22113-15-git-send-email-chang.seok.bae@xxxxxxxxx/
> >
> > In summary, ptracer's update on FS/GS selector and base
> > yields such results on tracee's base:
> > - When FS/GS selector only changed (to nonzero), fetch base
> > from GDT/LDT (legacy behavior)
> > - When FS/GS base (regardless of selector) changed, tracee
> > will have the base
> > "
>
> Indeed. But I never understood how this behavior could be implemented
> with the current ABI. As I understand it, gdb only ever sets the
> inferior register state by using a single ptrace() call to load the
> entire state, which means that the kernel does not know whether just
> FS is being written or whether FS and FSBASE are being written.

GDB writes the register state back as soon as the user changes any one register.


> What actual ptrace() call does gdb use when a 64-bit gdb debugs a
> 64-bit inferior? How about a 32-bit inferior?

GDB uses GETREGS both for 64-bit and 32-bit inferiors. If GETREGS is
not available, it errors out on 64-bit and falls back to PEEKUSER on 32-bit.


> > The ptracer would need to read registers back after changing the selector
> > to get the updated base.
>
> What would the actual API be?

GETREGS and PEEKUSER.


> I think it could make sense to add a whole new ptrace() command to
> tell the tracee to, in effect, MOV a specified value to a segment
> register. This call would have the actual correct semantics in which
> it would return an error code if the specified value is invalid and
> would return 0 on success. And then a second ptrace() call could be
> issued to read out FSBASE or GSBASE if needed. Would this be useful?
> What gdb commands would invoke it?

Could SETREGS handle it based on the above proposal?


> > The only time when both change at the same time, then, is when registers
> > are restored after returning from an inferior call. And then, it's the base
> > we want to take priority since we previously ensured that the base is always
> > up-to-date.
>
> Right. But how does the kernel tell the difference?

At all other times, only one of them changes. Could the kernel compare the old
and new values of selector and base to detect whether one or both changed?
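
A rough sketch of the comparison I have in mind, applied to the
proposal quoted above. This is not code from the patch series; the
struct, enum, and function names are all hypothetical, and the real
check would have to live wherever SETREGS/POKEUSER writes are handled.

```c
#include <stdint.h>

/* All names here are hypothetical; this is not code from the patches. */
struct seg_regs {
    uint16_t sel;   /* FS or GS selector */
    uint64_t base;  /* matching FSBASE or GSBASE */
};

enum seg_write_kind {
    SEG_WRITE_NONE,      /* neither value differs from the tracee's state */
    SEG_WRITE_SELECTOR,  /* selector only: fetch the base from GDT/LDT */
    SEG_WRITE_BASE,      /* base differs: the written base wins */
};

/*
 * Compare the tracee's current state (cur) with the state the ptracer
 * asks for (req).  A changed base takes priority regardless of the
 * selector, as in the quoted proposal.  The proposal only spells out
 * the selector-only case for nonzero selectors; how a caller treats a
 * selector-only change to zero is left open here (assumption).
 */
static enum seg_write_kind classify_seg_write(struct seg_regs cur,
                                              struct seg_regs req)
{
    if (req.base != cur.base)
        return SEG_WRITE_BASE;
    if (req.sel != cur.sel)
        return SEG_WRITE_SELECTOR;
    return SEG_WRITE_NONE;
}
```

One limitation is visible directly in the sketch: a write that changes
the selector and deliberately repeats the current base is
indistinguishable from a selector-only write, so it would be classified
as SEG_WRITE_SELECTOR.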

Regards,
Markus.