Re: [PATCH v4] ceph: bound encrypted snapshot suffix formatting

From: Ilya Dryomov

Date: Tue Apr 28 2026 - 08:08:16 EST


On Mon, Apr 27, 2026 at 10:12 AM Luis Henriques <luis@xxxxxxxxxx> wrote:
>
> On Fri, Apr 24 2026, Ilya Dryomov wrote:
>
> > On Fri, Apr 24, 2026 at 8:31 PM Viacheslav Dubeyko
> > <Slava.Dubeyko@xxxxxxx> wrote:
> >>
> >> On Fri, 2026-04-24 at 11:27 +0200, Ilya Dryomov wrote:
> >> > On Thu, Apr 23, 2026 at 8:04 PM Viacheslav Dubeyko
> >> > <Slava.Dubeyko@xxxxxxx> wrote:
> >> > >
> >> > > On Wed, 2026-04-22 at 11:53 +0200, Ilya Dryomov wrote:
> >> > > > On Fri, Apr 10, 2026 at 10:46 PM Viacheslav Dubeyko
> >> > > > <Slava.Dubeyko@xxxxxxx> wrote:
> >> > > > >
> >> > > > > On Fri, 2026-04-10 at 20:40 +0000, Viacheslav Dubeyko wrote:
> >> > > > > > On Thu, 2026-04-09 at 18:09 +0000, Viacheslav Dubeyko wrote:
> >> > > > > > > On Thu, 2026-04-09 at 10:39 +0800, Pengpeng Hou wrote:
> >> > > > > > > > ceph_encode_encrypted_dname() base64-encodes the encrypted snapshot
> >> > > > > > > > name into the caller buffer and then, for long snapshot names, appends
> >> > > > > > > > _<ino> with sprintf(p + elen, ...).
> >> > > > > > > >
> >> > > > > > > > Some callers only provide NAME_MAX bytes. For long snapshot names, a
> >> > > > > > > > large inode suffix can push the final encoded name past NAME_MAX even
> >> > > > > > > > though the encrypted prefix stayed within the documented 240-byte
> >> > > > > > > > budget.
> >> > > > > > > >
> >> > > > > > > > Format the suffix into a small local buffer first and reject names
> >> > > > > > > > whose suffix would exceed the caller's NAME_MAX output buffer.
> >> > > > > > > >
> >> > > > > > > > Signed-off-by: Pengpeng Hou <pengpeng@xxxxxxxxxxx>
> >> > > > > > > > ---
> >> > > > > > > > Changes since v3:
> >> > > > > > > > - reject `elen > 240` explicitly instead of relying only on the earlier
> >> > > > > > > > `WARN_ON()`
> >> > > > > > > > - rewrite the NAME_MAX bound check in terms of the final total length
> >> > > > > > > > instead of `NAME_MAX - prefix_len - elen`
> >> > > > > > > >
> >> > > > > > > > fs/ceph/crypto.c | 31 +++++++++++++++++++++++++++++--
> >> > > > > > > > 1 file changed, 29 insertions(+), 2 deletions(-)
> >> > > > > > > >
> >> > > > > > > > diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c
> >> > > > > > > > index f3de43ccb470..42e3fff34697 100644
> >> > > > > > > > --- a/fs/ceph/crypto.c
> >> > > > > > > > +++ b/fs/ceph/crypto.c
> >> > > > > > > > @@ -15,6 +15,12 @@
> >> > > > > > > > #include "mds_client.h"
> >> > > > > > > > #include "crypto.h"
> >> > > > > > > >
> >> > > > > > > > +/*
> >> > > > > > > > + * Reserve room for '_' + decimal 64-bit inode number + trailing NUL.
> >> > > > > > > > + * ceph_encode_encrypted_dname() copies only the visible suffix bytes.
> >> > > > > > > > + */
> >> > > > > > > > +#define CEPH_ENCRYPTED_SNAP_INO_SUFFIX_MAX sizeof("_18446744073709551615")
> >> > > > > > > > +
> >> > > > > > > >  static int ceph_crypt_get_context(struct inode *inode, void *ctx, size_t len)
> >> > > > > > > >  {
> >> > > > > > > >  	struct ceph_inode_info *ci = ceph_inode(inode);
> >> > > > > > > > @@ -209,6 +215,7 @@ int ceph_encode_encrypted_dname(struct inode *parent, char *buf, int elen)
> >> > > > > > > >  	struct inode *dir = parent;
> >> > > > > > > >  	char *p = buf;
> >> > > > > > > >  	u32 len;
> >> > > > > > > > +	int prefix_len = 0;
> >> > > > > > > >  	int name_len = elen;
> >> > > > > > > >  	int ret;
> >> > > > > > > >  	u8 *cryptbuf = NULL;
> >> > > > > > > > @@ -219,6 +226,7 @@ int ceph_encode_encrypted_dname(struct inode *parent, char *buf, int elen)
> >> > > > > > > >  		if (IS_ERR(dir))
> >> > > > > > > >  			return PTR_ERR(dir);
> >> > > > > > > >  		p++; /* skip initial '_' */
> >> > > > > > > > +		prefix_len = 1;
> >> > > > > > > >  	}
> >> > > > > > > >
> >> > > > > > > >  	if (!fscrypt_has_encryption_key(dir))
> >> > > > > > > > @@ -271,8 +279,27 @@ int ceph_encode_encrypted_dname(struct inode *parent, char *buf, int elen)
> >> > > > > > > >
> >> > > > > > > >  	/* To understand the 240 limit, see CEPH_NOHASH_NAME_MAX comments */
> >> > > > > > > >  	WARN_ON(elen > 240);
> >> > > > > > > > -	if (dir != parent) // leading _ is already there; append _<inum>
> >> > > > > > > > -		elen += 1 + sprintf(p + elen, "_%ld", dir->i_ino);
> >> > > > > > > > +	if (elen > 240) {
> >> > > > > > > > +		elen = -ENAMETOOLONG;
> >> > > > > > > > +		goto out;
> >> > > > > > > > +	}
> >> > > > > > > > +
> >> > > > > > > > +	if (dir != parent) {
> >> > > > > > > > +		int total_len;
> >> > > > > > > > +		/* leading '_' is already there; append _<inum> */
> >> > > > > > > > +		char suffix[CEPH_ENCRYPTED_SNAP_INO_SUFFIX_MAX];
> >> > > > > > > > +
> >> > > > > > > > +		ret = snprintf(suffix, sizeof(suffix), "_%lu", dir->i_ino);
> >> > > > > > > > +		total_len = prefix_len + elen + ret;
> >> > > > > > > > +		if (total_len > NAME_MAX) {
> >> > > > > > > > +			elen = -ENAMETOOLONG;
> >> > > > > > > > +			goto out;
> >> > > > > > > > +		}
> >> > > > > > > > +
> >> > > > > > > > +		memcpy(p + elen, suffix, ret);
> >> > > > > > > > +		/* Include the leading '_' skipped by p. */
> >> > > > > > > > +		elen = total_len;
> >> > > > > > > > +	}
> >> > > > > > > >
> >> > > > > > > >  out:
> >> > > > > > > >  	kfree(cryptbuf);
> >> > > > > > >
> >> > > > > > > Looks good.
> >> > > > > > >
> >> > > > > > > Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
> >> > > > > > >
> >> > > > > > > Let me run xfstests for the patch to double check that everything is OK. I'll
> >> > > > > > > share the result ASAP.
> >> > > > > > >
> >> > > > > >
> >> > > > > > The xfstests run was successful. I don't see any issues with the patch.
> >> > > > > >
> >> > > > > > Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > > > Applied on testing branch of CephFS kernel client git tree.
> >> > > >
> >> > > > Hi Pengpeng, Slava,
> >> > > >
> >> > > > This patch caught my attention because my understanding was that the
> >> > > > entire CEPH_NOHASH_NAME_MAX + sha256() scheme was put in place precisely
> >> > > > to handle longer names nicely and make them fit into a NAME_MAX-sized buffer.
> >> > > > Simply rejecting longer names seemed to be in direct contradiction with
> >> > > > that and yet the patch on its own was clearly merited given
> >> > > >
> >> > > > * (240 bytes is the maximum size allowed for snapshot names to take into
> >> > > > * account the format: '_<SNAPSHOT-NAME>_<INODE-NUMBER>'.)
> >> > > >
> >> > > > comment on CEPH_NOHASH_NAME_MAX definition.
> >> > > >
> >> > > > I dug a bit deeper and started a discussion in [1]. The preliminary
> >> > > > conclusion is that the 240 bytes assumption was a mistake -- somehow
> >> > > > the minimum number of characters needed for <inum> ended up being used
> >> > > > instead of the maximum. CEPH_NOHASH_NAME_MAX value is likely incorrect
> >> > > > and should have been smaller -- something along the lines of 174 -
> >> > > > SHA256_DIGEST_SIZE instead of 180 - SHA256_DIGEST_SIZE.
> >> > > >
> >> > >
> >> > > The limit could be 240 or larger, but either way we need to handle it
> >> > > properly, and this patch has exactly that goal. Is the 240-byte limit
> >> > > your concern? As far as I can see, this patch doesn't introduce it; it
> >> > > was there before this modification. We can raise the limit at any
> >> > > time.
> >> >
> >> > Hi Slava,
> >> >
> >> > My take on this is that there shouldn't be a limitation to begin with
> >> > (other than NAME_MAX which is natural and applies universally, not just
> >> > to snapshot names). This function has a bunch of code that is there
> >> > specifically to handle longer names and avoid any artificial limits:
> >> > the part of the name that spills over CEPH_NOHASH_NAME_MAX is hashed
> >> > and the whole thing is set up in such a way that the end result is
> >> > never bigger than 240 bytes. 240 isn't a random number -- it was
> >> > picked on purpose to leave room for _ prefix and _<INODE-NUMBER> suffix
> >> > (255 - 1 - 1 - 13) but a mistake appears to have crept in. Instead of
> >> > accounting for the maximum possible <INODE-NUMBER> length (which is 20
> >> > in decimal encoding), it accounted only for 13 (likely because it
> >> > happens to be the maximum possible length in a single-MDS setup?).
> >> >
> >> > IMO the right course of action here would be to see if the hashing
> >> > parameters can be adjusted, not introduce new ENAMETOOLONG errors.
> >> >
> >> >
> >>
> >> Hi Pengpeng,
> >>
> >> Could you please rework the patch taking into account the shared remarks?
> >
> > I would hold on until the discussion in [1] comes to a conclusion. It
> > turns out that the userspace client doesn't encrypt snapshot names
> > anymore, so another (much worse, at least IMO) route would be to drop
> > this bit of functionality from the kernel client as well [2].
>
> OK, let me see if I understood this correctly:
>
> - The user-space client implementation leaks metadata (the snapshot names)
> - The kernel client doesn't leak the snapshot names, though there are
>   bugs to be fixed (including on the MDS side)
> - The proposed solution is to drop the kernel snapshot name encryption.
>
> So, currently snapshots created on the kernel client can't be accessed
> from the user-space client, and vice-versa?

Hi Luis,

I _hope_ not -- since the inode that corresponds to the snapshot
created on the kernel client would have the fscrypt context (i.e.
fscrypt_auth_len > 0), the userspace client should be able to process
it generically.

>
> Also, won't dropping the encryption from the kernel client effectively
> make old snapshots created using a kernel client unusable?

I don't think so but it looks like these scenarios haven't been tested
when the change [1] went in. Personally, I'm not convinced that taking
away the ability to encrypt snapshot names completely (as opposed to
at least allowing snapshot names to be either encrypted or unencrypted
depending on whether the client that created the snapshot had the key)
was the right move. In fact, I would have seriously considered going
in the opposite direction of disallowing creating snapshots inside of
encrypted directories without a key the same way creating or linking in
files or directories is disallowed in that case.

I'm adding Chris, Venky and Patrick to this thread to avoid split
discussion.

[1] https://github.com/ceph/ceph/commit/73a8b2fda1976f553ec474027a3a73a5f6ceb441

Thanks,

Ilya