O_LARGEFILE / EOVERFLOW on tmpfs / NFS

From: Thomas Weißschuh
Date: Tue Sep 20 2022 - 07:12:12 EST


Hi everybody,

it seems there is some inconsistency in how large files opened *without*
O_LARGEFILE are handled on different filesystems.

On ext4/btrfs/xfs, opening a large file without O_LARGEFILE fails with
EOVERFLOW (as documented by open(2) and open(3p)).
On tmpfs/NFS the file is opened successfully, but the values returned by
lseek() are bogus.
(See the reproducer attached to this mail.)

This has been reproduced on 5.19.8 but the sources look the same on current
torvalds/master.

Is this intentional? To me it seems this should fail with EOVERFLOW everywhere.
Looking at the sources, the O_LARGEFILE flag is checked in generic_file_open()
but not all filesystems call this function.

If this is a bug, would it make sense to hoist this check into the VFS layer so
that not every filesystem has to call it manually?
Another question is backwards compatibility, because fixing this would
prevent applications from opening files they could open before.
On the other hand, they may have experienced silent data corruption before.

Thanks,
Thomas
/*
 * Compile:
 *   cc -m32 test.c -o test
 *
 * Prepare testfile:
 *   fallocate -l 4294967297 large-file
 *
 * Test:
 *   ./test large-file
 *
 * Result:
 *   Correct: open() fails, exit code 2
 *   Incorrect: Prints an incorrect file size
 *
 * Observation:
 *   Correct on ext4/btrfs
 *   Incorrect on NFS/tmpfs
 */

#include <assert.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

static_assert(sizeof(void *) == 4, "This test only makes sense on 32bit");
static_assert(sizeof(off_t) == 4, "Large file support has to be disabled");

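int main(int argc, char **argv)
{
	if (argc != 2)
		return 1;

	/* No O_LARGEFILE: expected to fail with EOVERFLOW for a large file. */
	int fd = open(argv[1], O_RDONLY);
	if (fd == -1)
		return 2;

	/* off_t is a signed 32-bit type here, so print it as a signed long. */
	off_t fsize = lseek(fd, 0, SEEK_END);
	printf("file size=%ld\n", (long)fsize);
	return 0;
}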