Re: [PATCH v1] binfmt_misc: fix crash when load/unload module

From: Christian Brauner
Date: Mon Jan 24 2022 - 05:40:33 EST


On Sun, Jan 23, 2022 at 04:33:41PM -0800, Tong Zhang wrote:
> We should unregister the table upon module unload otherwise something
> horrible will happen when we load binfmt_misc module again. Also note
> that we should keep value returned by register_sysctl_mount_point() and
> release it later, otherwise it will leak.
>
> reproduce:
> modprobe binfmt_misc
> modprobe -r binfmt_misc
> modprobe binfmt_misc
> modprobe -r binfmt_misc
> modprobe binfmt_misc
>
> [ 18.032038] Call Trace:
> [ 18.032108] <TASK>
> [ 18.032169] dump_stack_lvl+0x34/0x44
> [ 18.032273] __register_sysctl_table+0x6f4/0x720
> [ 18.032397] ? preempt_count_sub+0xf/0xb0
> [ 18.032508] ? 0xffffffffc0040000
> [ 18.032600] init_misc_binfmt+0x2d/0x1000 [binfmt_misc]
> [ 18.042520] binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point
> modprobe: can't load module binfmt_misc (kernel/fs/binfmt_misc.ko): Cannot allocate memory
> [ 18.063549] binfmt_misc: Failed to create fs/binfmt_misc sysctl mount point
> [ 18.204779] BUG: unable to handle page fault for address: fffffbfff8004802
>
> Fixes: 3ba442d5331f ("fs: move binfmt_misc sysctl to its own file")
> Signed-off-by: Tong Zhang <ztong0001@xxxxxxxxx>
> ---
> fs/binfmt_misc.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
> index ddea6acbddde..614aedb8ab2e 100644
> --- a/fs/binfmt_misc.c
> +++ b/fs/binfmt_misc.c
> @@ -817,12 +817,16 @@ static struct file_system_type bm_fs_type = {
> };
> MODULE_ALIAS_FS("binfmt_misc");
>
> +static struct ctl_table_header *binfmt_misc_header;
> +
> static int __init init_misc_binfmt(void)
> {
> int err = register_filesystem(&bm_fs_type);
> if (!err)
> insert_binfmt(&misc_format);
> - if (!register_sysctl_mount_point("fs/binfmt_misc")) {
> +
> + binfmt_misc_header = register_sysctl_mount_point("fs/binfmt_misc");
> + if (!binfmt_misc_header) {

The fix itself is obviously needed.

However, afaict the previous patch introduced another bug, and this
patch doesn't fix it either.

Namely, if you set CONFIG_SYSCTL=n and CONFIG_BINFMT_MISC={y,m}, then
register_sysctl_mount_point() will return NULL, causing modprobe
binfmt_misc to fail. However, before 3ba442d5331f ("fs: move binfmt_misc
sysctl to its own file") loading binfmt_misc would've succeeded even if
fs/binfmt_misc wasn't created in kernel/sysctl.c. Afaict, that goes for
both CONFIG_SYSCTL={y,n}, since even in the CONFIG_SYSCTL=y case the
kernel would've moved on if creating the sysctl header had failed.
And that makes sense since binfmt_misc is mountable wherever, not just
at fs/binfmt_misc.

All that indicates that the correct fix here would be to simply:

binfmt_misc_header = register_sysctl_mount_point("fs/binfmt_misc");

without checking for an error. That should fully restore the old
behavior.
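
To make the intended flow concrete, here is a rough userspace model of
init/exit with the check dropped. It is only a sketch and not kernel
code: the *_stub and *_model names and the sysctl_available switch are
hypothetical stand-ins, and it assumes (as far as I can tell this holds
for the real unregister_sysctl_table()) that unregistering a NULL header
is a no-op.

```c
/*
 * Userspace model of the suggested init/exit flow; NOT kernel code.
 * Every *_stub / *_model name and sysctl_available is hypothetical.
 */
#include <stddef.h>
#include <stdlib.h>

struct ctl_table_header { int unused; };

/* 0 models CONFIG_SYSCTL=n (or a registration failure), 1 models success. */
static int sysctl_available = 1;

static struct ctl_table_header *register_sysctl_mount_point_stub(const char *path)
{
	(void)path;
	if (!sysctl_available)
		return NULL;
	return calloc(1, sizeof(struct ctl_table_header));
}

static void unregister_sysctl_table_stub(struct ctl_table_header *header)
{
	/* Assumed to mirror the kernel helper: a NULL header is ignored. */
	free(header);
}

static struct ctl_table_header *binfmt_misc_header;

static int init_misc_binfmt_model(void)
{
	int err = 0;	/* stands in for register_filesystem(&bm_fs_type) */

	/* insert_binfmt(&misc_format) would run here when !err. */

	/* Keep the header for exit, but never fail the load because of it. */
	binfmt_misc_header = register_sysctl_mount_point_stub("fs/binfmt_misc");
	return err;
}

static void exit_misc_binfmt_model(void)
{
	unregister_sysctl_table_stub(binfmt_misc_header);
	binfmt_misc_header = NULL;	/* only so the model can be re-run */
}
```

In this model, repeated load/unload cycles release the header each time,
and a load with sysctl_available = 0 still returns 0, which is the old
behavior described above.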

> pr_warn("Failed to create fs/binfmt_misc sysctl mount point");
> return -ENOMEM;
> }
> @@ -831,6 +835,7 @@ static int __init init_misc_binfmt(void)
>
> static void __exit exit_misc_binfmt(void)
> {
> + unregister_sysctl_table(binfmt_misc_header);
> unregister_binfmt(&misc_format);
> unregister_filesystem(&bm_fs_type);
> }
> --
> 2.25.1
>