Re: [PATCH v5 4/4] fs: unicode: Add utf8 module and a unicode layer

From: Eric Biggers
Date: Mon Mar 29 2021 - 22:02:20 EST


On Tue, Mar 30, 2021 at 02:12:40AM +0530, Shreeya Patel wrote:
> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
> index 2c27b9a5cd6c..ad4b837f2eb2 100644
> --- a/fs/unicode/Kconfig
> +++ b/fs/unicode/Kconfig
> @@ -2,13 +2,26 @@
> #
> # UTF-8 normalization
> #
> +# CONFIG_UNICODE will be automatically enabled if CONFIG_UNICODE_UTF8
> +# is enabled. This config option adds the unicode subsystem layer which loads
> +# the UTF-8 module whenever any filesystem needs it.
> config UNICODE
> - bool "UTF-8 normalization and casefolding support"
> + bool
> +
> +# utf8data.h_shipped has a large database table which is an auto-generated
> +# decodification trie for the unicode normalization functions and it is not
> +# necessary to carry this large table in the kernel.
> +# Enabling UNICODE_UTF8 option will allow UTF-8 encoding to be built as a
> +# module and this module will be loaded by the unicode subsystem layer only
> +# when any filesystem needs it.
> +config UNICODE_UTF8
> + tristate "UTF-8 module"
> help
> Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
> support.
> + select UNICODE

This seems problematic; it allows users to set CONFIG_EXT4_FS=y (or
CONFIG_F2FS_FS=y) but then CONFIG_UNICODE_UTF8=m. Then the filesystem won't
work if the modules are located on the filesystem itself, since the UTF-8
module would have to be loaded from the very filesystem that needs it.

I think it should work analogously to CONFIG_FS_ENCRYPTION and
CONFIG_FS_ENCRYPTION_ALGS. That is, CONFIG_UNICODE should be a user-selectable
bool, and then the tristate symbols CONFIG_EXT4_FS and CONFIG_F2FS_FS should
select the tristate symbol CONFIG_UNICODE_UTF8 if CONFIG_UNICODE.
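
I.e., roughly the following. This is an untested sketch, not a replacement
patch: help texts and the existing select lines are elided with "...", and
the UNICODE_UTF8 symbol is assumed to lose its prompt so that it is only ever
set via select:

	config UNICODE
		bool "UTF-8 normalization and casefolding support"

	# No prompt: not user-visible, only selected by filesystems.
	config UNICODE_UTF8
		tristate

	config EXT4_FS
		tristate "The Extended 4 (ext4) filesystem"
		...
		select UNICODE_UTF8 if UNICODE

	config F2FS_FS
		tristate "F2FS filesystem support"
		...
		select UNICODE_UTF8 if UNICODE

Since select forces the selected symbol to be at least the value of the
selecting one, CONFIG_EXT4_FS=y with CONFIG_UNICODE=y would then give
CONFIG_UNICODE_UTF8=y, and UNICODE_UTF8 could be =m only if every filesystem
selecting it is itself =m.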

- Eric