Re: [PATCH 01/14] VFS: Add additional RESOLVE_* flags [ver #18]

From: Stefan Metzmacher
Date: Thu Mar 12 2020 - 13:11:27 EST


On 12.03.20 at 17:24, Linus Torvalds wrote:
> On Thu, Mar 12, 2020 at 2:08 AM Stefan Metzmacher <metze@xxxxxxxxx> wrote:
>>
>> The whole discussion was triggered by the introduction of a completely
>> new fsinfo() call:
>>
>> Would you propose to have 'at_flags' and 'resolve_flags' passed in here?
>
> Yes, I think that would be the way to go.

OK, that's fine with me as well :-)

>>> If we need linkat2() and friends, so be it. Do we?
>>
>> Yes, I'm going to propose something like this, as it would make the life
>> much easier for Samba to have the new features available on all path
>> based syscalls.
>
> Will samba actually use them? I think we've had extensions before that
> weren't worth the non-portability pain?

Yes, we're currently moving to the portable *at() calls as a start.
We also already make use of Linux-only features for performance reasons
in other places. Having the new resolve flags will make it possible to
move some of the performance-intensive work into non-Linux-specific
modules as a fallback.

I hope that we'll use most of this through io_uring in the end;
that's the reason Jens added the IORING_REGISTER_PERSONALITY feature
used for IORING_OP_OPENAT2.

> But yes, if we have a major package like samba use it, then by all
> means let's add linkat2(). How many things are we talking about? We
> have a number of system calls that do *not* take flags, but do do
> pathname walking. I'm thinking things like "mkdirat()"?)

I haven't looked them up in detail yet.
Jeremy can you provide a list?

Do you think we could route some of them like mkdirat() and mknodat()
via openat2() instead of creating new syscalls?

>> In addition I'll propose to have a way to specify the source of
>> removeat and unlinkat also by fd in addition to the source parent fd
>> and relative path; the reason is also to detect races of path
>> recycling.
>
> Would that be basically just an AT_EMPTY_PATH kind of thing? IOW,
> you'd be able to remove a file by doing
>
> fd = open(path.., O_PATH);
> unlinkat(fd, "", AT_EMPTY_PATH);
>
> Hmm. We have _not_ allowed filesystem changes without that last
> component lookup. Of course, with our dentry model, we *can* do it,
> but this smells fairly fundamental to me.
>
> It might avoid some of the extra system calls (ie you could use
> openat2() to do the path walking part, and then
> unlinkat(AT_EMPTY_PATH) to remove it, and have a "fstat()" etc in
> between to verify that it's the right type of file or whatever - and
> you'd not need an unlinkat2() with resolve flags).

If that works safely for hardlinks, and also when another process does a
rename between openat2() and unlinkat(), we could try that.
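
To illustrate the hardlink concern, here is a minimal userspace sketch.
It only uses calls that exist today (fstat() on an O_PATH fd, fstatat()).
The helper names fd_is_same_inode() and hardlink_demo() and the /tmp
template are made up for this mail. It shows that one fd matches every
hardlink of the inode, so an fd alone cannot say which name should go
away:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd and dfd/name are the same inode,
 * 0 if they differ, -1 on error. */
static int fd_is_same_inode(int fd, int dfd, const char *name)
{
    struct stat by_fd, by_name;

    assert(name != NULL);
    if (fstat(fd, &by_fd) < 0)
        return -1;
    if (fstatat(dfd, name, &by_name, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    return by_fd.st_dev == by_name.st_dev &&
           by_fd.st_ino == by_name.st_ino;
}

/* Hypothetical demo: create "a", hardlink it as "b", open "a" with
 * O_PATH. Returns 1 if the fd matches both names, 0 if not, -1 on
 * setup failure. */
static int hardlink_demo(void)
{
    char tmpl[] = "/tmp/hardlink-demo-XXXXXX";
    int result = -1;

    if (mkdtemp(tmpl) == NULL)
        return -1;
    int dfd = open(tmpl, O_RDONLY | O_DIRECTORY);
    if (dfd < 0) {
        rmdir(tmpl);
        return -1;
    }

    int cfd = openat(dfd, "a", O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (cfd >= 0 && linkat(dfd, "a", dfd, "b", 0) == 0) {
        int fd = openat(dfd, "a", O_PATH | O_NOFOLLOW);
        if (fd >= 0) {
            /* Both names resolve to the inode behind fd. */
            result = fd_is_same_inode(fd, dfd, "a") == 1 &&
                     fd_is_same_inode(fd, dfd, "b") == 1;
            close(fd);
        }
    }

    /* Best-effort cleanup of the scratch directory. */
    if (cfd >= 0)
        close(cfd);
    unlinkat(dfd, "a", 0);
    unlinkat(dfd, "b", 0);
    close(dfd);
    rmdir(tmpl);
    return result;
}
```

Here hardlink_demo() is expected to return 1: the O_PATH fd opened via
"a" is indistinguishable from "b" by device and inode number.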

My initial naive idea was to have one syscall instead of
linkat2/renameat3/unlinkat2.

int xlinkat(int src_dfd, const char *src_path,
            int dst_dfd, const char *dst_path,
            const struct xlinkat_how *how, size_t how_size);

struct xlinkat_how {
        __u64 src_at_flags;
        __u64 src_resolve_flags;
        __u64 dst_at_flags;
        __u64 dst_resolve_flags;
        __u64 rename_flags;
        __s32 src_fd;
};

With src_dfd=-1, src_path=NULL, how.src_fd = -1, this would be like
linkat().
With dst_dfd=-1, dst_path=NULL, it would be like unlinkat().
Otherwise a renameat2().

If how.src_fd is not -1, it would be checked to be the same path as
specified by src_dfd and src_path.
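
As a rough userspace approximation of that src_fd check (not the
proposed kernel implementation), something like the sketch below could
be used today. guarded_unlinkat() and recycle_demo() are hypothetical
names, and returning ESTALE on a mismatch is my assumption only. Note
that the check and the unlinkat() are two separate steps, so a window
for a concurrent rename remains, which is what doing it in one syscall
would avoid:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for the how.src_fd check: unlink dfd/name only
 * if it is still the inode that src_fd refers to. */
static int guarded_unlinkat(int src_fd, int dfd, const char *name)
{
    struct stat by_fd, by_name;

    assert(name != NULL);
    if (fstat(src_fd, &by_fd) < 0 ||
        fstatat(dfd, name, &by_name, AT_SYMLINK_NOFOLLOW) < 0)
        return -1;
    if (by_fd.st_dev != by_name.st_dev ||
        by_fd.st_ino != by_name.st_ino) {
        errno = ESTALE; /* assumed error code, the name was recycled */
        return -1;
    }
    /* Not atomic: the name can still be replaced right here. */
    return unlinkat(dfd, name, 0);
}

/* Hypothetical demo: hold an O_PATH fd on "f", rename "g" over "f",
 * then try the guarded unlink. Returns 1 if the unlink was refused and
 * the new "f" survived, 0 if not, -1 on setup failure. */
static int recycle_demo(void)
{
    char tmpl[] = "/tmp/xlinkat-demo-XXXXXX";
    int result = -1;

    if (mkdtemp(tmpl) == NULL)
        return -1;
    int dfd = open(tmpl, O_RDONLY | O_DIRECTORY);
    if (dfd < 0) {
        rmdir(tmpl);
        return -1;
    }

    int f = openat(dfd, "f", O_CREAT | O_EXCL | O_WRONLY, 0600);
    int g = openat(dfd, "g", O_CREAT | O_EXCL | O_WRONLY, 0600);
    int held = openat(dfd, "f", O_PATH | O_NOFOLLOW);

    if (f >= 0 && g >= 0 && held >= 0 &&
        renameat(dfd, "g", dfd, "f") == 0) {
        int rc = guarded_unlinkat(held, dfd, "f");
        int saved = errno;
        result = rc == -1 && saved == ESTALE &&
                 faccessat(dfd, "f", F_OK, 0) == 0;
    }

    /* Best-effort cleanup of the scratch directory. */
    if (held >= 0)
        close(held);
    if (g >= 0)
        close(g);
    if (f >= 0)
        close(f);
    unlinkat(dfd, "f", 0);
    unlinkat(dfd, "g", 0);
    close(dfd);
    rmdir(tmpl);
    return result;
}
```

In recycle_demo() the name "f" is recycled by the rename, so the guarded
unlink should refuse to remove the new file.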

> I think Al needs to ok this kind of change. Maybe you've already
> discussed it with him and I just missed it.

This is the first time I'm discussing this.

Thanks for the useful feedback!
metze
