Re: [PATCH RFC] fhandle: expose u64 mount id to name_to_handle_at(2)

From: Aleksa Sarai
Date: Thu May 23 2024 - 11:53:01 EST


On 2024-05-21, Christian Brauner <brauner@xxxxxxxxxx> wrote:
> On Mon, May 20, 2024 at 05:35:49PM -0400, Aleksa Sarai wrote:
> > Now that we have stabilised the unique 64-bit mount ID interface in
> > statx, we can now provide a race-free way for name_to_handle_at(2) to
> > provide a file handle and corresponding mount without needing to worry
> > about racing with /proc/mountinfo parsing.
> >
> > As with AT_HANDLE_FID, AT_HANDLE_UNIQUE_MNT_ID reuses a statx AT_* bit
> > that doesn't make sense for name_to_handle_at(2).
> >
> > Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx>
> > ---
>
> So I think overall this is probably fine (famous last words). If it's
> just about being able to retrieve the new mount id without having to
> take the hit of another statx system call it's indeed a bit much to
> add a revised system call for this. Although I did say earlier that I
> wouldn't rule that out.
>
> But if we'd do that then it'll be a long discussion on the form of the new
> system call and the information it exposes.
>
> For example, I lack the grey hair needed to understand why
> name_to_handle_at() returns a mount id at all. The pitch in commit
> 990d6c2d7aee ("vfs: Add name to file handle conversion support") is that
> the (old) mount id can be used to "lookup file system specific
> information [...] in /proc/<pid>/mountinfo".

The logic was presumably to allow you to know which mount the resolved
file handle came from. If you use AT_EMPTY_PATH this is not needed,
because you could just fstatfs (and now statx(AT_EMPTY_PATH)) the same
file descriptor. But if you give name_to_handle_at() almost any other
path, there is no race-free way to make sure that you know which
filesystem the file handle came from.
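
To make that concrete, here is a minimal sketch of the existing interface
(the helper name path_to_handle() is made up for this example and is not
part of the patch; it only uses the old 32-bit mount ID, not the proposed
AT_HANDLE_UNIQUE_MNT_ID flag). The handle and the mount ID come out of the
same path lookup:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper (illustration only): resolve `path` to a file handle
 * and report the (old, 32-bit) mount ID the kernel saw during that same
 * lookup. Returns 0 on success or a negative errno value on failure.
 * On success, *out is a malloc()ed handle that the caller must free().
 */
static int path_to_handle(const char *path, struct file_handle **out,
                          int *mount_id)
{
    struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);

    if (!fh)
        return -ENOMEM;
    /* Tell the kernel how much room follows the header. */
    fh->handle_bytes = MAX_HANDLE_SZ;

    /* One lookup fills in both the handle and the mount ID. */
    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) < 0) {
        int err = errno;

        free(fh);
        return -err;
    }

    *out = fh;
    return 0;
}
```

Anything that re-resolves the path afterwards (statx(), or parsing
/proc/self/mountinfo) to work out the mount is a second lookup, and the
path may resolve to something else by then. Note that some filesystems do
not support file handles, so a caller should expect errors such as
EOPNOTSUPP here.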

I don't know if that could lead to security issues (I guess an attacker
could find a way to try to manipulate the file handle you get back, and
then try to trick you into operating on the wrong filesystem with
open_by_handle_at()) but it is definitely something you'd want to avoid.

> Granted, that's doable but it'll mean a lot of careful checking to avoid
> races for mount id recycling because they're not even allocated
> cyclically. With lots of containers it becomes even more of an issue. So
> it's doubtful whether exposing the mount id through name_to_handle_at()
> would be something that we'd still do.
>
> So really, if this is just about a use-case where you want to spare the
> additional system call for statx() and you need the mnt_id then
> overloading is probably ok.
>
> But it remains an unpleasant thing to look at.
>

Yeah, I agree it's ugly.

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
