Re: ia32_sysenter_target does not preserve EFLAGS
From: Ingo Molnar
Date: Sat Mar 28 2015 - 05:46:48 EST

* Denys Vlasenko <vda.linux@xxxxxxxxxxxxxx> wrote:
> On Fri, Mar 27, 2015 at 9:00 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Mar 27, 2015 at 7:25 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> >>
> >> Apparently, users *don't* depend on arithmetic flags
> >> to survive over syscall. They are also okay with DF flag
> >> being cleared.
> >
> > Generally, users probably don't care about many registers at all being
> > saved, but it's worth noting that the reason system calls save/restore
> > even caller-saved registers is at least partly in order to avoid any
> > kernel information leaks.
> >
> > I don't believe that user mode will ever reasonably care about the
> > arithmetic flags being changed, but at the same time I also don't think it
> > is something we should ever consider a "feature" we should try to take
> > advantage of. Generally we should try to not mess with the flag state,
> > and I'd *much* rather make the rule be that all the system call return
> > paths restore flags as much as possible.
>
> "We don't clobber anything" ABI has its appeal.
> OTOH, fulfilling ABI's promises has a cost which has to be paid
> on every syscall, regardless of whether userspace needed it or not.
>
> Example. This is the uclibc implementation of write():
>
> 00000000004acfc4 <__libc_write>:
>   4acfc4:  53                    push   %rbx
>   4acfc5:  48 63 ff              movslq %edi,%rdi
>   4acfc8:  b8 01 00 00 00        mov    $0x1,%eax
>   4acfcd:  0f 05                 syscall
>   4acfcf:  48 89 c3              mov    %rax,%rbx
>   4acfd2:  48 81 fb 00 f0 ff ff  cmp    $0xfffffffffffff000,%rbx
>   4acfd9:  76 0f                 jbe    4acfea <__libc_write+0x26>
>   4acfdb:  e8 64 15 00 00        callq  4ae544 <__GI___errno_location>
>   4acfe0:  89 da                 mov    %ebx,%edx
>   4acfe2:  f7 da                 neg    %edx
>   4acfe4:  89 10                 mov    %edx,(%rax)
>   4acfe6:  48 83 c8 ff           or     $0xffffffffffffffff,%rax
>   4acfea:  5b                    pop    %rbx
>   4acfeb:  c3                    retq
>
> This is a C function. [...]

Arguably that's a self-inflicted wound of uclibc: nothing keeps it
from taking advantage of the syscall ABI and avoiding the double
save/restores.
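
As a rough illustration of what taking advantage of the syscall ABI could
look like, here is a minimal sketch of an inline wrapper for x86-64 Linux.
It is not uclibc's code: the name raw_write() is hypothetical, the syscall
number 1 is the write number from the listing above, and the clobber list
assumes the current ABI in which only rax, rcx and r11 change across the
SYSCALL instruction. On other targets the sketch simply falls back to the
libc write().

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper for illustration only; not taken from uclibc. */
static inline long raw_write(int fd, const void *buf, unsigned long len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;

    /*
     * rax carries the syscall number in and the result out. The SYSCALL
     * instruction itself overwrites rcx and r11. No other register is
     * listed as clobbered, so the compiler is free to keep live values
     * in rdi, rsi, rdx, r8-r10 across the call (an assumption that
     * relies on the kernel preserving them).
     */
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "0" (1L), "D" ((long)fd), "S" (buf), "d" (len)
                      : "rcx", "r11", "memory");

    /* Same range check as the cmp/jbe pair in the disassembly:
     * results in [-4095, -1] are negated errno values. */
    if ((unsigned long)ret > -4096UL) {
        errno = (int)-ret;
        return -1;
    }
    return ret;
#else
    /* Fallback so the sketch still builds on other targets. */
    return (long)write(fd, buf, len);
#endif
}
```

Because the wrapper is inline and declares only rcx and r11 as clobbered,
a caller does not need to spill the other caller-saved registers around
it, which is the double save/restore mentioned above.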

> [...] Therefore any of its callers assumes that C-clobbered registers
> can be, indeed, clobbered here, so if that caller uses any of them,
> it saves/restores them.
>
> All efforts by kernel code to save/restore C-clobbered registers,
> eight of them, are in vain. It's just useless work. Userspace does
> not benefit from that effort.

That's true only in this particular uclibc case, where user-space
decided to not take advantage of the save/restore property of the
kernel.

> If our syscall ABI would say that those regs are not preserved, we
> could have a bit faster syscalls. Any userspace code which really
> had to have those registers preserved across a particular syscall,
> could push/pop them itself.

We'd at minimum have to zero out the registers to avoid the
information leak and at that point it's in fact faster to just
save/restore in the syscall and allow user-space to take advantage of
that, if it wishes to.

We cannot do it the other way around.

Thanks,
Ingo