Re: [PATCH v3 1/2] tools/nolibc: fcntl: Add fallocate()

From: Thomas Weißschuh

Date: Thu May 14 2026 - 07:52:10 EST


Hi David,

sorry for the long delay.

On 2026-05-03 23:38:52+0100, David Laight wrote:
> On Sun, 3 May 2026 18:28:39 +0200
> Thomas Weißschuh <linux@xxxxxxxxxxxxxx> wrote:
>
> > On 2026-05-02 22:26:07+0100, David Laight wrote:
> > > On Sat, 2 May 2026 05:00:06 +0200
> > > Willy Tarreau <w@xxxxxx> wrote:
> > >
> > > > On Fri, May 01, 2026 at 09:18:31AM +0100, David Laight wrote:
> > > > > On Fri, 1 May 2026 01:41:24 +0900
> > > > > Daniel Palmer <daniel@xxxxxxxxx> wrote:
> >
> > (...)
> ...
> > > Looking again, the code is trying to copy what the compiler generates
> > > when it passes a 64bit quantity on stack or in 2 registers.
> > > So f(a, b, c, d) is traditionally 'push d; push c; push b, push a; call f'.
> > > With the normal 'stack grows down' this gives (in increasing addresses):
> > > return address
> > > a
> > > b
> > > c
> > > d
> > > If 'b,c' is replaced by a 64bit value then you want the stack memory
> > > to contain the correct representation of a 64bit value.
> > > So for LE you need to pass the low part before the high part.
> >
> > Most architectures should use registers and not the stack.
>
> Indeed, but the registers are picked as though they are caching on-stack locations.
> So the way the stack would look is relevant.

So I was a bit confused about the 64-bit arguments.
When I worked on llseek(), the system call definition explicitly splits
the 64-bit arguments into two 32-bit ones. For fallocate, that split
happens implicitly according to the function-call ABI.
I still don't see, where the stack layout is supposed to come into play.

> > > But there is one architecture (parisc) where the stack goes the other way.
> > > It is BE (I believe), but would need the LE argument order.
> > >
> > > I think all the conditionals could be moved out of the fallocate code.
> > > I'm not sure of the exact pre-processor conditionals but something like:
> > >
> > > #if (__BITS_PER_LONG == 64) || defined(x86_x32) || defined(mips_n32)
> > > #define __NOLIBC_ARG64(x) (x)
> > > #else
> > > /* The on-stack data has to be in the natural order for a 64bit value. */
> > > #if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN_) ^ defined(STACK_GROWS_UP)
> > > #define __NOLIBC_ARG64(x) ((int)(x)), ((int)((long long)(x) >> 32))
> > > #else
> > > #define __NOLIBC_ARG64(x) ((int)((long long)(x) >> 32)), ((int)(x))
> > > #endif
> > > #endif
> >
> > I highly dislike macros like these which may expand to either a single
> > or multiple arguments.
>
> > Also this logic looks much more complicated than what Daniel proposed.
>
> I'm not sure the casts are needed.
> The implicit casts from the function calls may mean that it is enough
> to use (x) and (x) >> 32.
>
> I really didn't like the expansions Daniel proposed in sys_fallocate() itself.
> Neither the fallocate code nor the header file really said what was going on.
> You had to read both together and then read between the lines so see why
> it was necessary at all.

We should have a longer explanatory comment next to the macros which
explains them.

> > > Then (if I've got it right):
> > > static __attribute__((unused))
> > > int _sys_fallocate(int fd, int mode, off_t offset, off_t size)
> > > {
> > > return _syscall(__NR_fallocate, fd, mode, __NOLIBC_ARG64(offset),
> > > __NOLIBC_ARG64(size));
> > > }
> >
> > We intentionally use __nolibc_syscallX() inside nolibc proper over
> > _syscall()/syscall() to have the arity of the syscall written out.
>
> In this case the number of arguments varies - so that wouldn't work.
> But I can see why it would normally be better.
>
> >
> > > There is the other problem that arm32 (and maybe others) requires the 64bit
> > > variable be aligned in the stack frame.
> > > So f(int a, long long b) is effectively f(int a, int pad, long long b).
> > > (This usually causes grief with lseek().)
> > > So a pad might be needed for some system calls on some 32bit architectures.
> > > This could be done with something that expands to '' or '0,'.
> >
> > I don't see why we have to worry about variable alignment or stack
> > direction at all. That's the compilers job.
>
> Except you do because you are trying to match what the compiler would
> generate for a 64bit argument by passing two 32bit values.

I still can't follow.

> I'm pretty sure that the normal syscall interface just passes the registers
> that contain arguments over from user space to kernel space and then calls
> the kernel function (ie without any regard for the types of the parameters).
> This is normally fine except for lseek() and mmap() - they may have an
> interface that re-orders the arguments.

Yes, but if we explain that our custom macros do the same as the
implicit argument splitting, and test that it works, that should be
clear enough.

> > We do have both lseek() and
> > parisc32 nowadays and both work fine. Also with Daniel's series on top.
> > Am I missing something?
>
> godbolt doesn't have a parisc32 compiler - so I couldn't check.
> But I think if you compare the code for calls to f1(int, int, int, int),
> f2(int, long long, int) and f3(int, int long long) for f1(1,2,3,4),
> f2(1, 2ull<<32 | 3, 4) and f3(1, 2, 3ull<<32 | 4) against another BE
> system the 64bit constants will overlay different 32bit ones.

There are parisc compilers here:
https://mirrors.edge.kernel.org/pub/tools/crosstool/


Thomas