Re: [PATCH -V14 0/11] Generic name to handle and open by handle syscalls

From: Aneesh Kumar K. V
Date: Thu Jul 01 2010 - 12:29:15 EST


On Tue, 15 Jun 2010 22:42:50 +0530, "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxxxxxxx> wrote:

Hi Al,

Any chance of getting this reviewed/merged in the next merge window?

-aneesh

> Hi,
>
> The patches below implement open-by-handle support using exportfs
> operations. This allows a user-space application to map a file name to a
> file handle and later open the file using that handle. This should be usable
> for userspace NFS [1] and 9P [2] servers. XFS already supports this with the
> ioctls XFS_IOC_PATH_TO_HANDLE and XFS_IOC_OPEN_BY_HANDLE.
>
> [1] http://nfs-ganesha.sourceforge.net/
> [2] http://thread.gmane.org/gmane.comp.emulators.qemu/68992
>
> git repo for the patchset at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kvaneesh/linux-open-handle.git open-by-handle-v14
>
> Changes from V13:
> a) Add support for file descriptor to handle conversion. This is needed
> so that we find the right file handle for newly created files.
>
> Changes from V12:
> a) Use CAP_DAC_READ_SEARCH instead of CAP_DAC_OVERRIDE in open_by_handle
> b) Return -ENOTDIR if the O_DIRECTORY flag is specified in open_by_handle
> with a handle for a non-directory
>
> Changes from V11:
> a) Add necessary documentation to different functions
> b) Add null pathname support to faccessat and linkat similar to
> readlinkat.
> c) compile fix on x86_64
>
> Changes from V10:
> a) Missed an stg refresh before sending out the patchset. Send
> updated patchset.
>
> Changes from V9:
> a) Fix compile errors with CONFIG_EXPORTFS not defined
> b) Return -EOPNOTSUPP if file system doesn't support fh_to_dentry exportfs callback.
>
> Changes from V8:
> a) exportfs_decode_fh now returns -ESTALE if export operations are not defined.
> b) drop get_fsid super_operations. Instead use superblock to store uuid.
>
> Changes from V7:
> a) open_by_handle now uses mountdirfd to identify the vfsmount.
> b) We don't validate the UUID passed as a part of file handle in open_by_handle.
> The UUID is provided as part of the file handle so that userspace can
> use the kernel-returned handle as is. It also helps in finding the 16-byte
> filesystem UUID in userspace without using file-system-specific libraries to
> read the file system superblock. If a particular file system doesn't support
> a UUID or any form of unique id, this field in the file handle will be zero filled.
> c) drop freadlink syscall. Instead use readlinkat with a NULL pathname to
> read the target of the link referred to by fd. This is similar to
> sys_utimensat
> d) Instead of open-coding all the open-flag-related checks, use helper functions.
> Added finish_open_by_handle, similar to finish_open.
> e) Fix may_open to not return ELOOP for a symlink when we are called from handle open.
> open(2) still returns an error as expected.
>
> Changes from V6:
> a) Add uuid to vfsmount lookup and drop uuid to superblock lookup
> b) Return -EOPNOTSUPP in sys_name_to_handle if the UUID returned by the file
> system doesn't give the same vfsmount on lookup. This ensures that we fail
> sys_name_to_handle when multiple file systems return the same UUID.
>
> Changes from V5:
> a) added sys_name_to_handle_at syscall which takes AT_SYMLINK_NOFOLLOW flag
> instead of two syscalls sys_name_to_handle and sys_lname_to_handle.
> b) addressed review comments from Neil Brown
> c) rebased to b91ce4d14a21fc04d165be30319541e0f9204f15
> d) Add compat_sys_open_by_handle
>
> Changes from V4:
> a) Changed the syscall arguments so that we don't need compat syscalls
> as suggested by Christoph
> c) Added two new syscall sys_lname_to_handle and sys_freadlink to work with
> symlinks
> d) Changed open_by_handle to work with all file types
> e) Add ext3 support
>
> Changes from V3:
> a) Code cleanup suggested by Andreas
> b) x86_64 syscall support
> c) add compat syscall
>
> Changes from V2:
> a) Support system wide unique handle.
>
> Changes from v1:
> a) handle size is now specified in bytes
> b) returns -EOVERFLOW if the handle size is too small
> c) dropped open_handle syscall and added open_by_handle_at syscall
> open_by_handle_at takes mount_fd as the directory fd of the mount point
> containing the file
> e) handle will only be unique in a given file system. So for an NFS server
> exporting multiple file systems, the NFS server will have to internally track
> the mount point to which a file handle belongs. We should be able to do this
> much more easily than expecting the kernel to give a system-wide unique file
> handle. A system-wide unique file handle would need much larger changes to the
> exportfs or VFS interface, and I was not sure whether we really need to do
> that in the kernel or in user space
> f) open_handle_at now only checks for DAC_OVERRIDE capability
>
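A minimal sketch of the userspace bookkeeping described in point e) above:
a server that exports several file systems could keep a small table keyed by
the 16-byte fs UUID carried in the handle and use it to pick the mount fd to
pass to open_by_handle. Everything here (struct mount_entry, struct
mount_table, mount_table_add, mount_table_lookup, MAX_MOUNTS) is a
hypothetical name made up for illustration and is not part of the patchset.

```c
#include <stddef.h>
#include <string.h>

#define FS_UUID_LEN 16  /* the 16-byte filesystem UUID carried in the handle */
#define MAX_MOUNTS  8   /* hypothetical fixed limit, for this sketch only */

/* Hypothetical table entry: one exported file system. */
struct mount_entry {
    unsigned char fs_uuid[FS_UUID_LEN];
    int mountfd;        /* fd of the mount directory for this file system */
};

struct mount_table {
    struct mount_entry entries[MAX_MOUNTS];
    size_t count;
};

/* Register a mount; returns 0 on success, -1 if the table is full. */
static int mount_table_add(struct mount_table *t,
                           const unsigned char *uuid, int mountfd)
{
    if (t->count >= MAX_MOUNTS)
        return -1;
    memcpy(t->entries[t->count].fs_uuid, uuid, FS_UUID_LEN);
    t->entries[t->count].mountfd = mountfd;
    t->count++;
    return 0;
}

/* Return the mount fd registered for this UUID, or -1 if it is unknown. */
static int mount_table_lookup(const struct mount_table *t,
                              const unsigned char *uuid)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (memcmp(t->entries[i].fs_uuid, uuid, FS_UUID_LEN) == 0)
            return t->entries[i].mountfd;
    return -1;
}
```

A server would call mount_table_add() once per exported mount and
mount_table_lookup() with the handle's UUID before opening by handle. A
zero-filled UUID (a file system with no unique id) cannot be told apart this
way and would need some other key.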
>
> Example program (x86_32; x86_64 would need different syscall numbers):
> -------
> cc <source.c> -luuid
> --------
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
>
> #include <fcntl.h>
> #include <unistd.h>
> #include <errno.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <string.h>
> #include <uuid/uuid.h>
>
> struct file_handle {
>     int handle_size;
>     int handle_type;
>     uuid_t fs_uuid;
>     unsigned char handle[0];
> };
>
> #define AT_FDCWD -100
> #define AT_SYMLINK_FOLLOW 0x400
>
> static int name_to_handle(const char *name, struct file_handle *fh)
> {
>     return syscall(338, AT_FDCWD, name, fh, AT_SYMLINK_FOLLOW);
> }
>
> static int lname_to_handle(const char *name, struct file_handle *fh)
> {
>     return syscall(338, AT_FDCWD, name, fh, 0);
> }
>
> static int fd_to_handle(int fd, struct file_handle *fh)
> {
>     return syscall(338, fd, NULL, fh, AT_SYMLINK_FOLLOW);
> }
>
> static int open_by_handle(int mountfd, struct file_handle *fh, int flags)
> {
>     return syscall(339, mountfd, fh, flags);
> }
>
> #define BUFSZ 100
> int main(int argc, char *argv[])
> {
>     int fd;
>     int ret, done = 0;
>     int mountfd;
>     int handle_sz;
>     struct stat bufstat;
>     char buf[BUFSZ];
>     char uuid[37];  /* uuid_unparse() writes 36 characters plus a NUL */
>     struct file_handle *fh = NULL;
>
>     if (argc != 3) {
>         printf("Usage: %s <filename> <mount-dir-name>\n", argv[0]);
>         exit(1);
>     }
> again:
>     /* The first pass uses size 0; on EOVERFLOW, retry with the reported size. */
>     handle_sz = fh ? fh->handle_size : 0;
>     free(fh);
>     fh = malloc(sizeof(struct file_handle) + handle_sz);
>     if (!fh) {
>         perror("malloc");
>         exit(1);
>     }
>     fh->handle_size = handle_sz;
>     errno = 0;
>     ret = lname_to_handle(argv[1], fh);
>     if (ret && errno == EOVERFLOW) {
>         printf("Found the handle size needed to be %d\n", fh->handle_size);
>         goto again;
>     } else if (ret) {
>         perror("Error:");
>         exit(1);
>     }
> do_again:
>     uuid_unparse(fh->fs_uuid, uuid);
>     printf("UUID:%s\n", uuid);
>     printf("Waiting for input");
>     getchar();
>     mountfd = open(argv[2], O_RDONLY | O_DIRECTORY);
>     if (mountfd < 0) {  /* 0 is a valid descriptor */
>         perror("Error:");
>         exit(1);
>     }
>     fd = open_by_handle(mountfd, fh, O_RDONLY);
>     if (fd < 0) {
>         perror("Error:");
>         exit(1);
>     }
>     printf("Reading the content now \n");
>     fstat(fd, &bufstat);
>     ret = S_ISLNK(bufstat.st_mode);
>     if (ret) {
>         memset(buf, 0, BUFSZ);
>         /* readlinkat() does not NUL-terminate; leave room for the NUL */
>         readlinkat(fd, NULL, buf, BUFSZ - 1);
>         printf("%s is a symlink pointing to %s\n", argv[1], buf);
>     }
>     memset(buf, 0, BUFSZ);
>     while (1) {
>         ret = read(fd, buf, BUFSZ - 1);
>         if (ret <= 0)
>             break;
>         buf[ret] = '\0';
>         printf("%s", buf);
>         memset(buf, 0, BUFSZ);
>     }
>     /* Now check for faccess */
>     if (faccessat(fd, NULL, W_OK, 0) == 0) {
>         printf("Got write permission on the file \n");
>     } else
>         perror("faccess error");
>     /* now try to create a hardlink */
>     if (linkat(fd, NULL, AT_FDCWD, "test", 0) == 0) {
>         printf("created hardlink\n");
>     } else
>         perror("linkat error");
>     if (done)
>         exit(0);
>     printf("Map fd to handle \n");
>     ret = fd_to_handle(fd, fh);
>     if (ret) {
>         perror("Error:");
>         exit(1);
>     }
>     done = 1;
>     goto do_again;
> }
>
> -aneesh