Re: [LTP] [PATCH v4 0/5] Add fchmodat2 testing suite

From: Andrea Cervesato
Date: Fri Aug 02 2024 - 03:58:32 EST

Next message: syzbot: "[syzbot] [trace?] linux-next test error: WARNING in rcu_core"
Previous message: color Ice: "Ubuntu RT2X00 WIFI USB Driver Kernel NULL pointer Dereference&Use-After-Free Vulnerability"
In reply to: Petr Vorel: "Re: [LTP] [PATCH v4 0/5] Add fchmodat2 testing suite"
Next in thread: Aleksa Sarai: "Re: [LTP] [PATCH v4 0/5] Add fchmodat2 testing suite"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 8/2/24 09:49, Petr Vorel wrote:

Hi!
On 8/2/24 07:42, Petr Vorel wrote:

On 2024-08-01, Petr Vorel <pvorel@xxxxxxx> wrote:

Hi all,

This is a patch-set that implements fchmodat2() syscall coverage.
fchmodat2() has been added in kernel 6.6 in order to support
AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH in fchmodat().
There's no man pages yet, so please take the following links as
main documentation along with kernel source code:

I would hope that it'd be at least Christian's fork [1], but it's not there.
I suppose nobody is working on the man page.

https://www.phoronix.com/news/fchmodat2-For-Linux-6.6
https://lore.kernel.org/lkml/20230824-frohlocken-vorabend-725f6fdaad50@brauner/
***********
* WARNING *
***********
fchmodat2_02 fails with EOPNOTSUPP because of missing feature.

For a record, it's fchmodat2_01.c (from this patchset) which is failing (on
6.10.1-1.g4c78d6f-default Tumbleweed and 6.6.21-0-lts Alpine, both x86_64 VMs).
Andrea, I would personally just skip test on EOPNOTSUPP (that's what we do in
LTP on EOPNOTSUPP). The question why is not supported and whether is going to be
fixed.
Looking into glibc change 65341f7bbe ("linux: Use fchmodat2 on fchmod for flags
different than 0 (BZ 26401)") one year old change from glibc-2.39 [2] it looks
just accepted behavior (glibc returns EOPNOTSUPP on symlink):
+ /* Some Linux versions with some file systems can actually
+ change symbolic link permissions via /proc, but this is not
+ intentional, and it gives inconsistent results (e.g., error
+ return despite mode change). The expected behavior is that
+ symbolic link modes cannot be changed at all, and this check
+ enforces that. */
+ if (S_ISLNK (st.st_mode))
+ {
__close_nocancel (pathfd);
- return ret;
+ __set_errno (EOPNOTSUPP);
+ return -1;
+ }
Also musl also behaves the same on his fallback on old kernels [3]
(it started 10 years ago on 0dc48244 ("work around linux's lack of flags
argument to fchmodat syscall") when SYS_fchmodat was used and kept when this
year SYS_fchmodat2 started to be used in d0ed307e):
int ret = __syscall(SYS_fchmodat2, fd, path, mode, flag);
if (ret != -ENOSYS) return __syscall_ret(ret);
if (flag != AT_SYMLINK_NOFOLLOW)
return __syscall_ret(-EINVAL);
struct stat st;
int fd2;
char proc[15+3*sizeof(int)];
if (fstatat(fd, path, &st, flag))
return -1;
if (S_ISLNK(st.st_mode))
return __syscall_ret(-EOPNOTSUPP);

According to documentation, the feature has been implemented in
kernel 6.6, but __in reality__ AT_SYMLINK_NOFOLLOW is not working
on symbolic files. Also kselftests, which are meant to test the
functionality, are not working and they are treating fchmodat2()
syscall failure as SKIP. Please take a look at the following code
before reviewing:
https://github.com/torvalds/linux/blob/8f6a15f095a63a83b096d9b29aaff4f0fbe6f6e6/tools/testing/selftests/fchmodat2/fchmodat2_test.c#L123

I see there is a kselftest workaround in 4859c257d295 ("selftests: Add fchmodat2
selftest") [4], where fchmodat2 failure on symlink is simply skipped.
Aleksa, you're probably aware of this fchmodat2() failure on symlinks. Does
anybody work or plan to work on fixing it? LTP has policy to not cover kernel
bugs, if it's not expected to be working we might just skip the test as well.

If I understand the bug report, the issue is that fchmodat2() doesn't
work on symlinks?

Yes.

This is intentional -- Christian fixed a tree-wide bug a while ago[1]
where some filesystems would change the mode of symlinks despite
returning an error (usually EOPNOTSUPP) and IIRC a few others would
happily change the mode of symlinks.

Ah, I've seen this in the past. Thanks a lot for reminding me.

The current intended behaviour is to always return EOPNOTSUPP, and AFAIK
there is no plan to re-enable the changing of symlink modes. EOPNOTSUPP
was chosen because that's what filesystems were already returning.
(While this is a little confusing, VFS syscalls return EINVAL for an
unsupported flag, not EOPNOTSUPP.)
The benefit of an AT_SYMLINK_NOFOLLOW flag is not just to to allow a
syscall to operate on symlinks, it also allows programs to safely
operate on path components without worrying about symlinks being
followed (this is relevant for container runtimes, where we are
operating on untrusted filesystem roots -- though in the case of
fchmodat2(2) you would probably just use AT_EMPTY_PATH in practice). So
an error here is actually what you want as a program that uses
AT_SYMLINK_NOFOLLOW (since the actual operation is intentionally not
supported by filesystems).

Thanks for the explanation. I also have a question around this topic:
AT_SYMLINK_NOFOLLOW has been added in order to avoid symlinks being
followed. But if filesystems are not supporting it, why do we have this
feature?

@Aleksa please correct me if I'm wrong.

AFAIK (reading 5d1f903f75a8 commit message [1] and Aleksa's comments) previously
it was an idea which many of the filesystem implemented wrongly - a mess
regardless whether supported by the filesystem or not. I particularly like
changing the mode but fail EOPNOTSUPP. And because glibc and musl did EOPNOTSUPP
anyway (I found that as well), the best was just to follow this in kernel and
unify all filesystems behavior by disabling this in VFS.

Also, is it an unsupported feature only on certain filesystems?

Disabled in VFS => unsupported on all filesystems.

Ah right, that's quite obvious indeed.

Thanks a lot for explaining the background!

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d1f903f75a80daa4dfb3d84e114ec8ecbf29956

I see a RFC UAPI related patchset [5] which touches include/uapi/linux/fcntl.h,
but AFAIK it's not related to this problem.

Yeah this is unrelated, that patch is about clarifying how AT_* flags
are allocated, not syscall behaviour.

Thanks!

Kind regards,
Petr

@Andrea, I guess we want something like this:
+++ testcases/kernel/syscalls/fchmodat2/fchmodat2_01.c
@@ -43,9 +43,10 @@ static void test_symbolic_link(void)
verify_mode(fd_dir, FNAME, S_IFREG | 0700);
verify_mode(fd_dir, SNAME, S_IFLNK | 0777);
- TST_EXP_PASS(fchmodat2(fd_dir, SNAME, 0640, AT_SYMLINK_NOFOLLOW));
- verify_mode(fd_dir, FNAME, S_IFREG | 0700);
- verify_mode(fd_dir, SNAME, S_IFLNK | 0640);
+ if (tst_kvercmp(6, 6, 0) >= 0) {
+ TST_EXP_FAIL(tst_syscall(__NR_fchmodat2, fd_dir, SNAME, 0640,
+ AT_SYMLINK_NOFOLLOW), EOPNOTSUPP);
+ }
}
static void test_empty_folder(void)

I think it makes more sense to filter out only filesystems which are not
supporting this feature (see my comment above).

Due disabled in VFS since 5d1f903f75a8 all filesystems fail with EOPNOTSUPP,
thus failure in LTP (TBROK), which needs to be handled. Before 5d1f903f75a8
some of them actually changed the mode (e.g. btrfs, ext4, xfs), but that's no
longer the case. And because it got backported to all stable/LTS, we can expect
this is the correct behavior.

Kind regards,
Petr

Best regards,
Andrea

I guess we can just assume EOPNOTSUPP for the test. I will add the .tag and errno check.

Thanks,
Andrea

Next message: syzbot: "[syzbot] [trace?] linux-next test error: WARNING in rcu_core"
Previous message: color Ice: "Ubuntu RT2X00 WIFI USB Driver Kernel NULL pointer Dereference&Use-After-Free Vulnerability"
In reply to: Petr Vorel: "Re: [LTP] [PATCH v4 0/5] Add fchmodat2 testing suite"
Next in thread: Aleksa Sarai: "Re: [LTP] [PATCH v4 0/5] Add fchmodat2 testing suite"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]