Re: [PATCH v2 06/14] btrfs: fix deletion of a range spanning parts two RAID stripe extents

From: Filipe Manana
Date: Thu Jan 09 2025 - 10:24:56 EST


On Tue, Jan 7, 2025 at 12:50 PM Johannes Thumshirn <jth@xxxxxxxxxx> wrote:
>
> From: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
>
> When a user requests the deletion of a range that spans multiple stripe
> extents and btrfs_search_slot() returns us the second RAID stripe extent,
> we need to pick the previous item and truncate it, if there's still a
> range to delete left, move on to the next item.
>
> The following diagram illustrates the operation:
>
> |--- RAID Stripe Extent ---||--- RAID Stripe Extent ---|
> |--- keep ---|--- drop ---|
>
> While at it, comment the trivial case of a whole item delete as well.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@xxxxxxx>
> ---
> fs/btrfs/raid-stripe-tree.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/fs/btrfs/raid-stripe-tree.c b/fs/btrfs/raid-stripe-tree.c
> index 79f8f692aaa8f6df2c9482fbd7777c2812528f65..893d963951315abfc734e1ca232b3087b7889431 100644
> --- a/fs/btrfs/raid-stripe-tree.c
> +++ b/fs/btrfs/raid-stripe-tree.c
> @@ -103,6 +103,31 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le
> found_end = found_start + key.offset;
> ret = 0;
>
> + /*
> + * The stripe extent starts before the range we want to delete,
> + * but the range spans more than one stripe extent:
> + *
> + * |--- RAID Stripe Extent ---||--- RAID Stripe Extent ---|
> + * |--- keep ---|--- drop ---|
> + *
> + * This means we have to get the previous item, truncate its
> + * length and then restart the search.
> + */
> + if (found_start > start) {
> +
> + ret = btrfs_previous_item(stripe_root, path, start,
> + BTRFS_RAID_STRIPE_KEY);
> + if (ret < 0)
> + break;
> + ret = 0;
> +
> + leaf = path->nodes[0];
> + slot = path->slots[0];
> + btrfs_item_key_to_cpu(leaf, &key, slot);
> + found_start = key.objectid;
> + found_end = found_start + key.offset;

Hum, this isn't safe, ignoring the case where btrfs_previous_item()
returns 1, meaning there's no previous item.

In that case previous_item() returns pointing to the same leaf and
slot, and then below we delete the item instead of trimming it
(increasing its range start and decreasing its length).

Thanks.

> + }
> +
> if (key.type != BTRFS_RAID_STRIPE_KEY)
> break;
>
> @@ -156,6 +181,9 @@ int btrfs_delete_raid_extent(struct btrfs_trans_handle *trans, u64 start, u64 le
> break;
> }
>
> + /*
> + * Finally we can delete the whole item, no more special cases.
> + */
> ret = btrfs_del_item(trans, stripe_root, path);
> if (ret)
> break;
>
> --
> 2.43.0
>
>