Re: [PATCH] fs: fix i_writecount on shmem and friends

From: David Herrmann
Date: Thu Mar 20 2014 - 07:13:59 EST


Hi

On Thu, Mar 13, 2014 at 1:37 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> We do, but how do you get anything even attempt deny_write_access() on
> those? And what would the semantics of that be, anyway?

I just discovered /proc/self/map_files/ (only enabled with
CONFIG_CHECKPOINT_RESTORE). Attached is an example to trigger an
i_writecount underrun on an anon-mapping with your recommended fix
applied. You can also find it at [1].

This example is a bit more complex, but basically does this:
- get a fresh, executable zero-mapping
- write an executable into the mapping
- get a read-only file-descriptor to the underlying shmem file via
/proc/self/map_files/
- execute that mapping via /proc/self/map_files/
- try to get a writable FD via /proc/self/map_files, this will fail
with ETXTBUSY _even though_ we still own a writable mapping

Ok, maybe this example is stupid, uses non-standard functionality
(CONFIG_CHECKPOINT_RESTORE) and is a very unlikely use-case, but it
shows that there _are_ ways to trigger this underrun. The hard part is
to get access to an executable file-descriptor that was created via
alloc_file() rather than open(). Once you got this, you can always
trigger the underrun by executing the file while you still have a
_single_ writable mapping which wasn't accounted for in i_writecount.

/proc/self/map_files/ is root only, so this is not a security problem.
I'm fine if you argue /proc/self/map_files/ is insecure and shouldn't
be used. Or maybe it must be non-executable. I'm just saying there
might always be new interfaces that give you a file-descriptor to
files allocated with alloc_file() rather than open(). And once you got
this FD, you can always execve() it via /proc and get
deny_write_access() this way. By initializing i_writecount on all
files, we make sure this never happens.

Anyhow, as I really want this fixed either way, please let me know how
to proceed. I can polish your patch and resend it if you don't intend
to apply it yourself.

Thanks
david

[1] https://gist.githubusercontent.com/dvdhrm/9661331/raw/b751cc7859267bb62af393ba3817b92c5225a372/gistfile1.txt
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
char *args[64];
struct stat st;
sigset_t set;
char path[128];
void *map;
size_t size;
int sig, fd_exe, fd_map, fd;
pid_t pid;
ssize_t res;

/*
* If called with arguments, we just act as a sleeping process that
* waits for any signal to arrive.
*/
if (argc > 1) {
printf("dummy waiter init\n");
/* dummy waiter; SIGTERM terminates us anyway */
sigemptyset(&set);
sigaddset(&set, SIGTERM);
sigwait(&set, &sig);
printf("dummy waiter exit\n");
exit(0);
}

fd_exe = open("/proc/self/exe", O_RDONLY);
if (fd_exe < 0) {
printf("open(/proc/self/exe) failed: %m\n");
abort();
}

if (fstat(fd_exe, &st) < 0) {
printf("fstat(fd_exe) failed: %m\n");
abort();
}

/* page-align */
size = (st.st_size + 4095ULL) & ~4095ULL;

map = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (map == MAP_FAILED) {
printf("mmap(MAP_ANON) failed: %m\n");
abort();
}

res = read(fd_exe, map, st.st_size);
if (res != st.st_size) {
if (res < 0)
printf("read(fd_exe) failed: %m\n");
else
printf("read(fd_exe) was truncated to %zu\n",
(size_t)res);
abort();
}

sprintf(path, "/proc/self/map_files/%lx-%lx",
(unsigned long)map,
(unsigned long)map + size);
fd_map = open(path, O_RDONLY);
if (fd_map < 0) {
printf("open(%s) failed: %m\n", path);
abort();
}

pid = fork();
if (pid < 0) {
printf("fork() failed: %m\n");
abort();
} else if (!pid) {
args[0] = argv[0];
args[1] = "dummy";
args[2] = NULL;
execve(path, args, NULL);
printf("execve() failed: %m\n");
abort();
}

/* sleep 100ms to make sure the execve() really worked */
usleep(100 * 1000ULL);

sprintf(path, "/proc/self/fd/%d", fd_map);
fd = open(path, O_RDWR);
if (fd < 0) {
printf("open(%s) failed: %m\n", path);
abort();
}

munmap(map, size);
close(fd_exe);

close(fd);
close(fd_map);
printf("exiting\n");

return 0;
}