Re: [PATCH 4/8] vfs: Fold casefolding into vfs

From: Theodore Y. Ts'o
Date: Fri Jan 03 2020 - 15:26:48 EST


On Mon, Dec 02, 2019 at 09:10:45PM -0800, Daniel Rosenberg wrote:
> @@ -228,6 +229,13 @@ static inline int dentry_string_cmp(const unsigned char *cs, const unsigned char
>
> #endif
>
> +bool needs_casefold(const struct inode *dir)
> +{
> + return IS_CASEFOLDED(dir) &&
> + (!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir));
> +}
> +EXPORT_SYMBOL(needs_casefold);
> +

I'd suggest adding a check to make sure that dir->i_sb->s_encoding is
non-NULL before needs_casefold() returns non-NULL. Otherwise a bug
(or a fuzzed file system) which manages to set the S_CASEFOLD flag without having s_encoding be initialized might cause a NULL dereference.

Also, maybe make needs_casefold() an inline function which returns 0
if CONFIG_UNICODE is not defined? That way the need for #ifdef
CONFIG_UNICODE could be reduced.

> @@ -247,7 +255,19 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
> * be no NUL in the ct/tcount data)
> */
> const unsigned char *cs = READ_ONCE(dentry->d_name.name);
> +#ifdef CONFIG_UNICODE
> + struct inode *parent = dentry->d_parent->d_inode;
>
> + if (unlikely(needs_casefold(parent))) {
> + const struct qstr n1 = QSTR_INIT(cs, tcount);
> + const struct qstr n2 = QSTR_INIT(ct, tcount);
> + int result = utf8_strncasecmp(dentry->d_sb->s_encoding,
> + &n1, &n2);
> +
> + if (result >= 0 || sb_has_enc_strict_mode(dentry->d_sb))
> + return result;
> + }
> +#endif

This is an example of how we could drop the #ifdef CONFIG_UNICODE
(moving the declaration of 'parent' into the #if statement) if
needs_casefold() always returns 0 if !defined(CONFIG_UNICODE).

> @@ -2404,7 +2424,22 @@ struct dentry *d_hash_and_lookup(struct dentry *dir, struct qstr *name)
> * calculate the standard hash first, as the d_op->d_hash()
> * routine may choose to leave the hash value unchanged.
> */
> +#ifdef CONFIG_UNICODE
> + unsigned char *hname = NULL;
> + int hlen = name->len;
> +
> + if (IS_CASEFOLDED(dir->d_inode)) {
> + hname = kmalloc(PATH_MAX, GFP_ATOMIC);
> + if (!hname)
> + return ERR_PTR(-ENOMEM);
> + hlen = utf8_casefold(dir->d_sb->s_encoding,
> + name, hname, PATH_MAX);
> + }
> + name->hash = full_name_hash(dir, hname ?: name->name, hlen);
> + kfree(hname);
> +#else
> name->hash = full_name_hash(dir, name->name, name->len);
> +#endif

Perhaps this could be refactored out? It's also used in
link_path_walk() and lookup_one_len_common().

(Note, there was some strageness in lookup_one_len_common(), where
hname is freed twice, the first time using kvfree() which I don't
believe is needed. But this can be fixed as part of the refactoring.)

- Ted