Re: [RFC PATCH v3 2/4] ceph: handle encrypted snapshot names in subdirectories

From: Xiubo Li
Date: Fri Mar 18 2022 - 07:28:58 EST



On 3/18/22 6:53 PM, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 3/17/22 11:45 PM, Luís Henriques wrote:
When creating a snapshot, the .snap directories for every subdirectory will
show the snapshot name in the "long format":

# mkdir .snap/my-snap
# ls my-dir/.snap/
_my-snap_1099511627782

Encrypted snapshots will need to be able to handle these snapshot names by
encrypting/decrypting only the snapshot part of the string ('my-snap').

Also, since the MDS prevents snapshot names to be bigger than 240 characters
it is necessary to adapt CEPH_NOHASH_NAME_MAX to accommodate this extra
limitation.

Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
---
fs/ceph/crypto.c | 189 ++++++++++++++++++++++++++++++++++++++++-------
fs/ceph/crypto.h | 11 ++-
2 files changed, 169 insertions(+), 31 deletions(-)

diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c
index beb73bbdd868..caa9863dee93 100644
--- a/fs/ceph/crypto.c
+++ b/fs/ceph/crypto.c
@@ -128,16 +128,100 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se
swap(req->r_fscrypt_auth, as->fscrypt_auth);
}
-int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr
*d_name, char *buf)
+/*
+ * User-created snapshots can't start with '_'. Snapshots that start with this
+ * character are special (hint: there aren't real snapshots) and use the
+ * following format:
+ *
+ * _<SNAPSHOT-NAME>_<INODE-NUMBER>
+ *
+ * where:
+ * - <SNAPSHOT-NAME> - the real snapshot name that may need to be decrypted,
+ * - <INODE-NUMBER> - the inode number for the actual snapshot
+ *
+ * This function parses these snapshot names and returns the inode
+ * <INODE-NUMBER>. 'name_len' will also bet set with the <SNAPSHOT-NAME>
+ * length.
+ */
+static struct inode *parse_longname(const struct inode *parent, const char *name,
+ int *name_len)
{
+ struct inode *dir = NULL;
+ struct ceph_vino vino = { .snap = CEPH_NOSNAP };
+ char *inode_number;
+ char *name_end;
+ int orig_len = *name_len;
+ int ret = -EIO;
+
+ /* Skip initial '_' */
+ name++;
+ name_end = strrchr(name, '_');
+ if (!name_end) {
+ dout("Failed to parse long snapshot name: %s\n", name);
+ return ERR_PTR(-EIO);
+ }
+ *name_len = (name_end - name);
+ if (*name_len <= 0) {
+ pr_err("Failed to parse long snapshot name\n");
+ return ERR_PTR(-EIO);
+ }
+
+ /* Get the inode number */
+ inode_number = kmemdup_nul(name_end + 1,
+ orig_len - *name_len - 2,
+ GFP_KERNEL);
+ if (!inode_number)
+ return ERR_PTR(-ENOMEM);
+ ret = kstrtou64(inode_number, 0, &vino.ino);
+ if (ret) {
+ dout("Failed to parse inode number: %s\n", name);
+ dir = ERR_PTR(ret);
+ goto out;
+ }
+
+ /* And finally the inode */
+ dir = ceph_find_inode(parent->i_sb, vino);
+ if (!dir) {
+ /* This can happen if we're not mounting cephfs on the root */
+ dir = ceph_get_inode(parent->i_sb, vino, NULL);
In this case IMO you should lookup the inode from MDS instead create it in the
cache, which won't setup the encryption info needed.

So later when you try to use this to dencrypt the snapshot names, you will hit
errors ? And also the case Jeff mentioned in previous thread could happen.
No, I don't see any errors. The reason is that if we get a I_NEW inode,
we do not have the keys to even decrypt the names. If you mount a
filesystem using as root a directory that is inside an encrypted
directory, you'll see the encrypted snapshot name:

# mkdir mydir
# fscrypt encrypt mydir
# mkdir -p mydir/a/b/c/d
# mkdir mydir/a/.snap/myspan
# umount ...
# mount <mon>:<port>:/a
# ls .snap

And we simply can't decrypt it because for that we'd need to have access
to the .fscrypt in the original filesystem mount root.

Should we resolve this issue ? Something like try to copy the .fscrypt when mounting '/a' ?

-- Xiubo