Re: [PATCH] btrfs: ioctl: fix inaccurate determination of exclusive_operation

From: David Sterba
Date: Tue Apr 04 2023 - 15:10:51 EST


On Mon, Apr 03, 2023 at 05:37:57AM -0400, xiaoshoukui wrote:
> > Yeah I think the assertion should also check for NONE status. The paused
> > balance makes the state tracking harder but in user-started (manual or
> > scripted) commands it's typically not racing.
>
> An assertion failure suggests that the code did not consider every case.
> After I added BTRFS_EXCLOP_NONE to the assertion, regression tests showed
> another scenario that I had missed.
>
> With the starting state == BTRFS_EXCLOP_BALANCE_PAUSED, when multiple devices are added
> concurrently to the same mount point and btrfs_exclop_balance finishes in one thread
> before a later thread executes the assertion in btrfs_exclop_balance, exclusive_operation
> will already have changed to the BTRFS_EXCLOP_BALANCE_PAUSED state.
>
> I also added btrfs_info before ASSERT to help troubleshooting:
> > btrfs_info(fs_info, "fs_info exclusive_operation: %d",
> > fs_info->exclusive_operation);
> > ASSERT(fs_info->exclusive_operation == BTRFS_EXCLOP_BALANCE ||
> > fs_info->exclusive_operation == BTRFS_EXCLOP_DEV_ADD ||
> > fs_info->exclusive_operation == BTRFS_EXCLOP_NONE);

> I think the assertion should also check for BTRFS_EXCLOP_BALANCE_PAUSED status.

Agreed, and this looks like the complete set of compatible operations.
I hope this does not enable some combination that should not be
allowed, but the enabling side does the try lock and allows only the
paused + dev_add combination.
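
To make the set of states concrete, here is a minimal user-space sketch in C,
not the kernel code. The enum is a reduced stand-in for the BTRFS_EXCLOP_*
values (names shortened, numeric values arbitrary), and both helper functions
are hypothetical names introduced only for illustration. The first models the
widened assertion discussed in this thread; the second models only the
"paused + dev_add" combination that the try lock is said to allow.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for the kernel's exclusive operation states.
 * Assumption: names are shortened and values are illustrative only. */
enum exclop {
	EXCLOP_NONE,
	EXCLOP_BALANCE_PAUSED,
	EXCLOP_BALANCE,
	EXCLOP_DEV_ADD,
	EXCLOP_DEV_REMOVE,
};

/* Hypothetical helper: the states the widened assertion would accept,
 * i.e. BALANCE, DEV_ADD, NONE and BALANCE_PAUSED. */
static bool exclop_balance_entry_ok(enum exclop cur)
{
	switch (cur) {
	case EXCLOP_NONE:
	case EXCLOP_BALANCE_PAUSED:
	case EXCLOP_BALANCE:
	case EXCLOP_DEV_ADD:
		return true;
	default:
		return false;
	}
}

/* Hypothetical helper: the only cross-operation combination mentioned
 * here, a device add starting while a balance is paused. */
static bool exclop_paused_combo_ok(enum exclop cur, enum exclop want)
{
	return cur == EXCLOP_BALANCE_PAUSED && want == EXCLOP_DEV_ADD;
}
```

In this model, a second device-add thread that observes EXCLOP_BALANCE_PAUSED
passes the check, while an unrelated state such as EXCLOP_DEV_REMOVE is still
rejected.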

Please send a fix with the analysis you did and add the relevant parts
of the stack traces. The reproducer would be good to have in fstests. For
the changelog, please describe the conditions that could trigger the
assertion; the reproducer itself is too long. Thanks.