Re: [PATCH] fs: RESOLVE_CACHED final path component fix

From: Andreas Grünbacher
Date: Thu Nov 09 2023 - 17:12:53 EST


On Thu, Nov 9, 2023 at 23:00, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Nov 09, 2023 at 08:08:44PM +0100, Andreas Gruenbacher wrote:
> > Jens,
> >
> > since your commit 99668f618062, applications can request cached lookups
> > with the RESOLVE_CACHED openat2() flag. When adding support for that in
> > gfs2, we found that this causes the ->permission inode operation to be
> > called with the MAY_NOT_BLOCK flag set for directories along the path,
> > which is good, but the ->permission check on the final path component is
> > missing that flag. The filesystem will then sleep when it needs to read
> > in the ACL, for example.
> >
> > This doesn't look like the intended RESOLVE_CACHED behavior.
> >
> > The file permission checks in path_openat() happen as follows:
> >
> > (1) link_path_walk() -> may_lookup() -> inode_permission() is called for
> > each but the final path component. If the LOOKUP_RCU nameidata flag is
> > set, may_lookup() passes the MAY_NOT_BLOCK flag on to
> > inode_permission(), which passes it on to the permission inode
> > operation.
> >
> > (2) do_open() -> may_open() -> inode_permission() is called for the
> > final path component. The MAY_* flags passed to inode_permission() are
> > computed by build_open_flags(), outside of do_open(), and passed down
> > from there. The MAY_NOT_BLOCK flag doesn't get set.
> >
> > I think we can fix this in build_open_flags(), by setting the
> > MAY_NOT_BLOCK flag when a RESOLVE_CACHED lookup is requested, right
> > where RESOLVE_CACHED is mapped to LOOKUP_CACHED as well.
>
> No. This will expose ->permission() instances to previously impossible
> cases of MAY_NOT_BLOCK lookups, and we already have enough trouble
> in that area.

True, lockdep wouldn't be happy.
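
For reference, the two permission-check paths described in the quoted
report can be modeled in user space roughly as below. This is only a
sketch of the flag flow, not kernel code: the MODEL_* constants, their
bit values, and all model_* helpers are made up for illustration, and
the real may_lookup() / may_open() do considerably more.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative bit values for this model only; not taken from kernel headers. */
#define MODEL_MAY_EXEC      0x01u
#define MODEL_MAY_READ      0x04u
#define MODEL_MAY_NOT_BLOCK 0x80u
#define MODEL_LOOKUP_RCU    0x40u

/* Stand-in for a filesystem's ->permission: may it sleep (e.g. to read an ACL)? */
static bool model_permission_may_sleep(unsigned int mask)
{
    return !(mask & MODEL_MAY_NOT_BLOCK);
}

/* Path (1): intermediate components. The mask is derived from the walk state. */
static unsigned int model_may_lookup_mask(unsigned int lookup_flags)
{
    unsigned int mask = MODEL_MAY_EXEC;

    if (lookup_flags & MODEL_LOOKUP_RCU)
        mask |= MODEL_MAY_NOT_BLOCK;
    return mask;
}

/* Path (2), step one: access mode computed up front, before the walk starts. */
static unsigned int model_build_acc_mode(void)
{
    return MODEL_MAY_READ; /* hypothetical read-only open */
}

/* Path (2), step two: final component. The precomputed mask is passed as is. */
static unsigned int model_may_open_mask(unsigned int acc_mode)
{
    return acc_mode;
}

/* Print whether ->permission could sleep on each of the two paths. */
static void model_report(unsigned int lookup_flags)
{
    unsigned int dir_mask = model_may_lookup_mask(lookup_flags);
    unsigned int final_mask = model_may_open_mask(model_build_acc_mode());

    printf("directories may sleep: %d\n",
           model_permission_may_sleep(dir_mask));
    printf("final component may sleep: %d\n",
           model_permission_may_sleep(final_mask));
}
```

Calling model_report(MODEL_LOOKUP_RCU) from a main() prints 0 for the
directories and 1 for the final component, which is the asymmetry
reported above: the non-blocking bit follows the walk state on path (1)
but is never added to the precomputed mask on path (2).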

> See RCU pathwalk patches I posted last cycle;

Do you have a pointer? Thanks.

> I'm
> planning to rebase what still needs to be rebased and feed the
> fixes into mainline, but that won't happen until the end of this
> week *AND* ->permission()-related part of code audit will need
> to be repeated and extended.
>
> Until then - no, with the side of fuck, no.
>