Re: [PATCH v2 2/8] unicode: Create utf8_check_strict_name

From: Gabriel Krisman Bertazi
Date: Tue Sep 03 2024 - 11:34:49 EST


André Almeida <andrealmeid@xxxxxxxxxx> writes:

> Create a helper function for filesystems do the checks required for
> casefold directories and strict enconding.
>
> Suggested-by: Gabriel Krisman Bertazi <gabriel@xxxxxxxxxx>
> Signed-off-by: André Almeida <andrealmeid@xxxxxxxxxx>
> ---
> fs/unicode/utf8-core.c | 26 ++++++++++++++++++++++++++
> include/linux/unicode.h | 2 ++
> 2 files changed, 28 insertions(+)
>
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index 0400824ef493..4966e175ed71 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c

I don't think this belongs in fs/unicode. Whether a filesystem rejects
invalid UTF-8 names is a matter of filesystem semantics and, while
fs/unicode provides utf8_validate to verify that a string is valid, it has
no business looking into superblock and inode flags.

It would be better placed as a libfs helper.

> @@ -214,3 +214,29 @@ void utf8_unload(struct unicode_map *um)
> }
> EXPORT_SYMBOL(utf8_unload);
>
> +/**
> + * utf8_check_strict_name - Check if a given name is suitable for a directory

To follow the namespace in libfs, we could call it

generic_ci_validate_strict_name

> + *
> + * This functions checks if the proposed filename is suitable for the parent

suitable => valid

> + * directory. That means that only valid UTF-8 filenames will be accepted for
> + * casefold directories from filesystems created with the strict enconding flags.

enconding flags => encoding flag

> + * That also means that any name will be accepted for directories that doesn't
> + * have casefold enabled, or aren't being strict with the enconding.

encoding

> + *
> + * @inode: inode of the directory where the new file will be created
> + * @d_name: name of the new file

d_name means 'dentry name'. Just 'name' is enough here, since it doesn't
matter whether the qstr comes from a dentry.

> + *
> + * Returns:
> + * * True if the filename is suitable for this directory. It can be true if a
> + * given name is not suitable for a strict enconding directory, but the
> + * directory being used isn't strict
> + * * False if the filename isn't suitable for this directory. This only happens
> + * when a directory is casefolded and is strict about its encoding.
> + */
> +bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name)
> +{
> + return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
> + sb_has_strict_encoding(dir->i_sb) &&
> + utf8_validate(dir->i_sb->s_encoding, d_name));
> +}

Now that it is a helper, it could be unfolded into something more
readable:

if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
	return true;

/* Should never happen unless the filesystem is corrupt. */
if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
	return true;

return !utf8_validate(...)
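
To make the control flow concrete, here is a rough userspace sketch of how
the unfolded helper might look once it lives in libfs. Everything other than
the helper body is a hypothetical stand-in so it compiles outside the kernel:
the struct layouts, the IS_CASEFOLDED and WARN_ON_ONCE macros, the crude
utf8_validate (which only checks byte structure, not normalization), and the
mock_create caller are all assumptions for illustration. The only behaviour
carried over from the patch is that utf8_validate returns 0 for a valid
name, hence the negation on the last line.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel types and macros. */
struct qstr { const unsigned char *name; size_t len; };
struct unicode_map { int version; };
struct super_block { struct unicode_map *s_encoding; bool strict; };
struct inode { struct super_block *i_sb; bool casefolded; };

#define IS_CASEFOLDED(inode)	((inode)->casefolded)
#define WARN_ON_ONCE(cond)	(cond)	/* the real macro also logs a warning */

static bool sb_has_strict_encoding(const struct super_block *sb)
{
	return sb->strict;
}

/* Crude structural check only: returns 0 if valid, -1 otherwise. */
static int utf8_validate(const struct unicode_map *um, const struct qstr *str)
{
	size_t i = 0;

	(void)um;
	while (i < str->len) {
		unsigned char c = str->name[i];
		size_t extra, k;

		if (c < 0x80)
			extra = 0;
		else if ((c & 0xE0) == 0xC0)
			extra = 1;
		else if ((c & 0xF0) == 0xE0)
			extra = 2;
		else if ((c & 0xF8) == 0xF0)
			extra = 3;
		else
			return -1;	/* stray continuation or invalid lead byte */

		if (i + extra >= str->len)
			return -1;	/* sequence truncated */
		for (k = 1; k <= extra; k++)
			if ((str->name[i + k] & 0xC0) != 0x80)
				return -1;
		i += extra + 1;
	}
	return 0;
}

/* The helper as suggested in the review, unfolded for readability. */
static bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name)
{
	if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
		return true;

	/* Should never happen unless the filesystem is corrupt. */
	if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
		return true;

	return !utf8_validate(dir->i_sb->s_encoding, name);
}

/* Hypothetical caller: how a filesystem's create path might use the helper. */
static int mock_create(struct inode *dir, const char *s)
{
	struct qstr name = { (const unsigned char *)s, strlen(s) };

	return generic_ci_validate_strict_name(dir, &name) ? 0 : -EINVAL;
}

static void example_usage(void)
{
	struct unicode_map um = { 0 };
	struct super_block strict_sb = { &um, true };
	struct super_block lax_sb = { &um, false };
	struct inode strict_dir = { &strict_sb, true };
	struct inode lax_dir = { &lax_sb, true };

	assert(mock_create(&strict_dir, "caf\xc3\xa9") == 0);
	assert(mock_create(&strict_dir, "bad\xff") == -EINVAL);
	assert(mock_create(&lax_dir, "bad\xff") == 0);	/* not strict: anything goes */
}
```

The early returns keep the "not our business" cases (no casefold, no strict
flag) apart from the corruption case, so only the last line actually
consults the encoding.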

> +EXPORT_SYMBOL(utf8_check_strict_name);
> diff --git a/include/linux/unicode.h b/include/linux/unicode.h
> index 4d39e6e11a95..fb56fb5e686c 100644
> --- a/include/linux/unicode.h
> +++ b/include/linux/unicode.h
> @@ -76,4 +76,6 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> struct unicode_map *utf8_load(unsigned int version);
> void utf8_unload(struct unicode_map *um);
>
> +bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name);
> +
> #endif /* _LINUX_UNICODE_H */

--
Gabriel Krisman Bertazi