Re: Compat 32-bit syscall entry from 64-bit task!?

From: Denys Vlasenko
Date: Fri Jan 20 2012 - 16:52:10 EST


On Thu, Jan 19, 2012 at 1:38 AM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
> On 01/18/2012 03:28 PM, Linus Torvalds wrote:
>> On Wed, Jan 18, 2012 at 1:53 PM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>>>
>>> I think we can obviously agree that regsets is the only way to go for
>>> any kind of new state.
>>
>> So I really don't necessarily agree at all.
>>
>> Exactly because there is a heavy burden to introducing new models.
>> It's not only relatively much more kernel code, it's also relatively
>> much more painful for user code. If we can hide it in existing
>> structures, user code is *much* better off, because any existing code
>> to get the state will just continue to work. Otherwise, you need to
>> have the code to figure out the new structures (how do you compile it
>> without the new kernel headers?), you need to do the extra accesses
>> conditionally etc etc.
>>
>> There's a real cost to introducing new interfaces. There's a *reason*
>> people try to make do with old ones.
>
> Of course.  However, the whole point with regsets is that at the very
> least the vast majority of the infrastructure is generic and extends
> without a bunch of new machine.  What you are saying is "we might be
> able to get away with existing state", what I'm saying is "if we add
> state it should be a regset".
>
> The question if this should be new state is currently open.  I
> personally would still would prefer if this didn't overlay real CPU state.

What about extending of one of the GETREGSET layouts?
GETREGSET uses struct iovec. struct iovec has buf_len.
Currently, if buf_len is larger than the register structure
being requested, kernel simply returns less data than
userspace asks for.

In the x86 case, we can add additional field(s) at
the end of NT_PRSTATUS layout.

Old programs which use PTRACE_GETREGS will get
old user_regs_struct layout (without appended fields).
Old programs which use
PTRACE_GETREGSET(NT_PRSTATUS, sizeof(struct user_regs_struct))
will also get the same.
New programs which use
PTRACE_GETREGSET(NT_PRSTATUS, sizeof(struct user_regs_struct) + N *
sizeof(long))
will get new fields too.

It's more intrusive than Linus' solution, but it avoids
the problem of overlaying real register data
with OS-specific special bits. It can also be employed
on other architectures (does not depend on having
a suitable register to abuse).

OTOH it is less intrusive than adding a whole new regset
just in order to add a few bits to an exiting one;
and would allow strace to extract both registers
and this new data with one operation instead of two.

Please see attached patch. NOT TESTED.

I'm new to this machinery, thus I might be missing some
obvious flaw with this idea (such as breaking on-disk
coredump format?)

--
vda
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 5026738..16455c0 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -419,6 +419,10 @@ static int putreg(struct task_struct *child,
if (child->thread.gs != value)
return do_arch_prctl(child, ARCH_SET_GS, value);
return 0;
+
+ case sizeof(struct user_regs_struct) + 0 * sizeof(long):
+ /* Modifying of thread_info->status is not allowed */
+ return 0;
#endif
}

@@ -469,6 +473,10 @@ static unsigned long getreg(struct task_struct *task, unsigned long offset)
return 0;
return get_desc_base(&task->thread.tls_array[GS_TLS]);
}
+
+ case sizeof(struct user_regs_struct) + 0 * sizeof(long):
+ /* One day we might want to expose other bits too */
+ return (task_thread_info(task)->status & TS_COMPAT);
#endif
}

@@ -1203,7 +1211,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
static struct user_regset x86_64_regsets[] __read_mostly = {
[REGSET_GENERAL] = {
.core_note_type = NT_PRSTATUS,
- .n = sizeof(struct user_regs_struct) / sizeof(long),
+ .n = (sizeof(struct user_regs_struct) + 1 * sizeof(long)) / sizeof(long),
.size = sizeof(long), .align = sizeof(long),
.get = genregs_get, .set = genregs_set
},