Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64
From: Yury Norov
Date: Thu Aug 18 2016 - 05:45:52 EST
On Wed, Aug 17, 2016 at 04:26:42PM +0100, Catalin Marinas wrote:
> On Wed, Aug 17, 2016 at 04:32:23PM +0200, Dr. Philipp Tomsich wrote:
> > On 17 Aug 2016, at 16:29, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
> > > On Wed, Aug 17, 2016 at 02:54:59PM +0200, Dr. Philipp Tomsich wrote:
> > >> On 17 Aug 2016, at 14:48, Yury Norov <ynorov@xxxxxxxxxxxxxxxxxx> wrote:
> > >>> On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
> > >>>> On 17 Aug 2016, at 13:46, Yury Norov <ynorov@xxxxxxxxxxxxxxxxxx> wrote:
> > >>>>> This series enables aarch64 with ilp32 mode, and as supporting work,
> > >>>>> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > >>>>> existing 32-bit architectures but disabled for new arches (so 64-bit
> > >>>>> off_t is used by new userspace).
> > >>>>>
> > >>>>> This version is based on kernel v4.8-rc2.
> > >>>>> It works with glibc-2.23, and tested with LTP.
> > >>>>>
> > >>>>> This is an RFC because there is still no solid understanding of which type of register
> > >>>>> top-half delousing we prefer. In this patchset, w0-w7 are cleared for each
> > >>>>> syscall in the assembler entry. The alternative approach is to introduce compat
> > >>>>> wrappers, which is a little faster for natively routed syscalls (~2.6% for a syscall
> > >>>>> with no payload) but much more complicated.
> > >>>>
> > >>>> So you're saying there are 2 options:
> > >>>>
> > >>>> 1) easy to get right, slightly slower, same ABI to user space as 2
> > >>>> 2) harder to get right, minor performance benefit
> > >>>
> > >>> No, the ABI is a little different. With 1) we pass off_t in a register pair to
> > >>> syscalls; with 2), in a single register. So with 1) we'd take some wrappers from aarch32.
> > >>> See patch 12 here.
> > >>
> > >> From our experience with ILP32, I'd prefer to have off_t (and similar)
> > >> in a single register whenever possible (i.e. option #2). It feels
> > >> more natural to use the full 64bit registers whenever possible, as
> > >> ILP32 on ARMv8 should really be understood as a 64bit ABI with a 32bit
> > >> memory model.
> > >
> > > I think we are well past the point where we considered ILP32 a 64-bit
> > > ABI. It would have been nice but we decided that breaking POSIX
> > > compatibility is a bad idea, so we went back (again) to a 32-bit ABI for
> > > ILP32. While there are 64-bit arguments that, at a first look, would
> > > make sense to be passed in 64-bit registers, the kernel maintenance cost
> > > is significant with changes to generic files.
> > >
> > > Allowing 64-bit wide registers at the ILP32 syscall interface means that
> > > the kernel would have to zero/sign-extend the upper half of the 32-bit
> > > arguments for the cases where they are passed directly to a native
> > > syscall that expects a 64-bit argument. This (a) adds a significant
> > > number of wrappers to the generic code together with additional annotations
> > > to the generic unistd.h and (b) it adds a small overhead to the AArch32
> > > (compat) ABI since it doesn't need such generic wrapping (the upper half
> > > of 64-bit registers is guaranteed to be zero/preserved by the
> > > architecture when coming from the AArch32 mode).
> >
> > Yes, I remember the discussions and just wanted to put option #2 in
> > context again.
>
> I don't particularly like splitting 64-bit arguments in two 32-bit
> values either but I don't see a better alternative. To keep this
> mostly in the arch code we would need an additional table of syscall
> wrappers where the majority just use the default zero-extend everything
> with a few specific wrappers where we pass 64-bit arguments. Or we could
> set an extra bit in the syscall number for those syscalls that need
> special wrapping and avoid zero-extending. But neither of these looks any
> nicer (well, maybe only from the user-space perspective).
>
This is the discussion started by David Miller:
https://patchwork.kernel.org/patch/9132521/
After it, we switched to the current version.
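
For illustration only, here is a minimal user-space sketch of the two ideas
discussed above: clearing the top halves of the argument registers, and
rebuilding a 64-bit off_t from a pair of 32-bit registers. It is not the code
from the patchset. The helper names (clear_top_half, join_pair) are
hypothetical, and passing the low half first is an assumption made for the
example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: discard the upper 32 bits of a 64-bit register
 * value. This models the effect of clearing the top halves of the
 * argument registers on syscall entry.
 */
static inline uint64_t clear_top_half(uint64_t reg)
{
	return (uint32_t)reg;
}

/*
 * Hypothetical helper: rebuild a 64-bit offset from two registers that
 * each carry one 32-bit half. The low half is assumed to come first.
 * Any garbage in the top halves is dropped before the halves are joined.
 */
static inline int64_t join_pair(uint64_t lo_reg, uint64_t hi_reg)
{
	uint64_t lo = clear_top_half(lo_reg);
	uint64_t hi = clear_top_half(hi_reg);

	return (int64_t)((hi << 32) | lo);
}
```

With the pair-of-registers approach, only the few syscalls that take 64-bit
arguments need something like join_pair(); every other argument can be
zero-extended uniformly on entry.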
> > Everything points to just going with the pair-of-registers and getting
> > this merged quickly then, I suppose.
>
> I will refrain from commenting on how quickly we merge this ;) (it may
> be seen as binding by some).
>
> --
> Catalin