Re: [RFC][PATCH v2] procfs: Always expose /proc/<pid>/map_files/ and make it readable

From: Calvin Owens
Date: Fri Jan 30 2015 - 20:59:15 EST


On Thursday 01/29 at 17:30 -0800, Kees Cook wrote:
> On Tue, Jan 27, 2015 at 8:38 PM, Calvin Owens <calvinowens@xxxxxx> wrote:
> > On Monday 01/26 at 15:43 -0800, Andrew Morton wrote:
> >> On Tue, 27 Jan 2015 00:00:54 +0300 Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> >>
> >> > On Mon, Jan 26, 2015 at 02:47:31PM +0200, Kirill A. Shutemov wrote:
> >> > > On Fri, Jan 23, 2015 at 07:15:44PM -0800, Calvin Owens wrote:
> >> > > > Currently, /proc/<pid>/map_files/ is restricted to CAP_SYS_ADMIN, and
> >> > > > is only exposed if CONFIG_CHECKPOINT_RESTORE is set. This interface
> >> > > > is very useful for enumerating the files mapped into a process when
> >> > > > the more verbose information in /proc/<pid>/maps is not needed.
> >>
> >> This is the main (actually only) justification for the patch, and it is
> >> far too thin. What does "not needed" mean? Why can't people just use
> >> /proc/pid/maps?
> >
> > The biggest difference is that if you do something like this:
> >
> > fd = open("/stuff", O_BLAH);
> > map = mmap(NULL, 4096, PROT_BLAH, MAP_SHARED, fd, 0);
> > close(fd);
> > unlink("/stuff");
> >
> > ...then map_files/ gives you a way to get a file descriptor for
> > "/stuff", which you couldn't do with /proc/pid/maps.
> >
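
To make that concrete, here is a rough sketch of the reopen step in C. The
names map_files_path() and reopen_mapping() are made up for illustration
(nothing in the kernel or libc provides them); the only thing taken from the
interface itself is the link name format, start-end in hex. With the current
code the open() only succeeds with CAP_SYS_ADMIN on a kernel built with
CONFIG_CHECKPOINT_RESTORE.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Build "/proc/<pid>/map_files/<start>-<end>" into buf.
 * Addresses are printed as lowercase hex without a "0x" prefix,
 * matching the link names shown by ls in map_files/.
 * Returns 0 on success, -1 if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
int map_files_path(char *buf, size_t len, pid_t pid,
                   unsigned long start, unsigned long end)
{
    int n = snprintf(buf, len, "/proc/%ld/map_files/%lx-%lx",
                     (long)pid, start, end);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Try to get a file descriptor for whatever file backs the mapping
 * [start, end) in process pid. Returns the fd, or -1 with errno set
 * (no such mapping, insufficient privileges, no map_files support).
 * (Hypothetical helper, for illustration only.)
 */
int reopen_mapping(pid_t pid, unsigned long start, unsigned long end,
                   int flags)
{
    char path[96];

    if (map_files_path(path, sizeof(path), pid, start, end) != 0) {
        errno = ENAMETOOLONG;
        return -1;
    }
    return open(path, flags);
}
```

After the close()/unlink() sequence above, a call along the lines of
reopen_mapping(pid, (unsigned long)map, (unsigned long)map + 4096, O_RDONLY)
should hand back a descriptor for the unlinked file, assuming the mapping
ended up as its own VMA and the caller passes the permission checks.
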
> > It's also something of a win if you just want to see what is mapped at a
> > specific address, since you can just readlink() the symlink for the
> > address range you care about and it will go grab the appropriate VMA and
> > give you the answer. /proc/pid/maps requires walking the VMA tree, which
> > is quite expensive for processes with many thousands of threads, even
> > without the O(N^2) issue.
> >
> > (You have to know what address range you want though, since readdir() on
> > map_files/ obviously has to walk the VMA tree just like /proc/N/maps.)
> >
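
The single-range lookup could look roughly like this. Again only a sketch:
mapping_target() is a hypothetical helper, and it assumes you already know
the exact VMA boundaries, since the link name has to match the range you
are asking about.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Look up what is mapped at [start, end) in process pid by reading the
 * map_files symlink for that range, instead of parsing /proc/<pid>/maps.
 * On success, stores the NUL-terminated target in out and returns its
 * length (truncated to outlen - 1 if necessary). Returns -1 on failure:
 * no such mapping, no permission, or no map_files support.
 * (Hypothetical helper, for illustration only.)
 */
ssize_t mapping_target(pid_t pid, unsigned long start, unsigned long end,
                       char *out, size_t outlen)
{
    char link[96];
    ssize_t n;
    int len;

    if (outlen == 0)
        return -1;

    len = snprintf(link, sizeof(link), "/proc/%ld/map_files/%lx-%lx",
                   (long)pid, start, end);
    if (len < 0 || (size_t)len >= sizeof(link))
        return -1;

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    n = readlink(link, out, outlen - 1);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return n;
}
```
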
> >> > > > This patch moves the folder out from behind CHECKPOINT_RESTORE, and
> >> > > > removes the CAP_SYS_ADMIN restrictions. Following the links requires
> >> > > > the ability to ptrace the process in question, so this doesn't allow
> >> > > > an attacker to do anything they couldn't already do before.
> >> > > >
> >> > > > Signed-off-by: Calvin Owens <calvinowens@xxxxxx>
> >> > >
> >> > > Cc +linux-api@
> >> >
> >> > Looks good to me, thanks! Though I would really appreciate it if someone
> >> > from the security camp took a look as well.
> >>
> >> hm, who's that. Kees comes to mind.
> >>
> >> And reviewers' task would be a heck of a lot easier if they knew what
> >> /proc/pid/map_files actually does. This:
> >>
> >> akpm3:/usr/src/25> grep -r map_files Documentation
> >> akpm3:/usr/src/25>
> >>
> >> does not help.
> >>
> >> The 640708a2cff7f81 changelog says:
> >>
> >> : This one behaves similarly to the /proc/<pid>/fd/ one - it contains
> >> : symlinks one for each mapping with file, the name of a symlink is
> >> : "vma->vm_start-vma->vm_end", the target is the file. Opening a symlink
> >> : results in a file that point exactly to the same inode as them vma's one.
> >> :
> >> : For example the ls -l of some arbitrary /proc/<pid>/map_files/
> >> :
> >> : | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
> >> : | lr-x------ 1 root root 64 Aug 26 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
> >> : | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
> >> : | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
> >> : | lr-x------ 1 root root 64 Aug 26 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
> >>
> >> afaict this info is also available in /proc/pid/maps, so things
> >> shouldn't get worse if the /proc/pid/map_files permissions are at least
> >> as restrictive as the /proc/pid/maps permissions. Is that the case?
> >> (Please add to changelog).
> >
> > Yes, the only difference is that you can follow the link as per above.
> > I'll resend with a new message explaining that and the deletion thing.
> >
> >> There's one other problem here: we're assuming that the map_files
> >> implementation doesn't have bugs. If it does have bugs then relaxing
> >> permissions like this will create new vulnerabilities. And the
> >> map_files implementation is surprisingly complex. Is it bug-free?
> >
> > While I was messing with it I used it a good bit and didn't see any
> > issues, although I didn't actively try to fuzz it or anything. I'd be
> > happy to write something to test hammering it in weird ways if you like.
> > I'm also happy to write testcases for namespaces.
> >
> > As far as security issues go, as others have pointed out, you can't follow
> > the links unless you can ptrace the process in question, which seems
> > like a pretty solid guarantee. As Cyrill pointed out in the discussion
> > about the documentation, that's the same protection as /proc/N/fd/*, and
> > those links function in the same way.
>
> My concern here is that fd/* are connected as streams, and while that
> has a certain level of badness as an external-to-the-process attacker,
> PTRACE_MODE_READ is much weaker than PTRACE_MODE_ATTACH (which is
> required for access to /proc/N/mem). Since these fds are the things
> mapped into memory on a process, writing to them is a subset of access
> to /proc/N/mem, and I don't feel that PTRACE_MODE_READ is sufficient.

If you haven't done close() on a mmapped file, doesn't fd/* allow the
same access to the corresponding regions of memory? Or am I missing
something?

But that said, I can't think of any reason why requiring PTRACE_MODE_ATTACH
would be a problem. Would you rather that be enforced in follow_link(), as
the original patch did, or for the whole directory?

Thanks,
Calvin

> -Kees
>
> --
> Kees Cook
> Chrome OS Security