Re: [PATCH v4 2/3] btrfs: Split remaining space to discard in chunks

From: Luca Stefani
Date: Mon Sep 16 2024 - 06:51:28 EST




On 16/09/24 12:39, Qu Wenruo wrote:


On 2024/9/16 19:46, Luca Stefani wrote:
Per Qu Wenruo, on a very large and mostly empty disk, e.g. an 8TiB device,
we split the discard according to our super block locations, but the last
super block ends at 256G, so we can still submit one huge discard for the
range [256G, 8T), causing a very long delay.

We now split the remaining space to discard into chunks of at most
BTRFS_MAX_DATA_CHUNK_SIZE, in preparation for the introduction of
cancellation signal handling.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
Signed-off-by: Luca Stefani <luca.stefani.ge1@xxxxxxxxx>
---
  fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
  1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a5966324607d..cbe66d0acff8 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1239,7 +1239,7 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
                     u64 *discarded_bytes)
  {
      int j, ret = 0;
-    u64 bytes_left, end;
+    u64 bytes_left, bytes_to_discard, end;
      u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT);
      /* Adjust the range to be aligned to 512B sectors if necessary. */
@@ -1300,13 +1300,27 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
          bytes_left = end - start;
      }
-    if (bytes_left) {
+    while (bytes_left) {
+        if (bytes_left > BTRFS_MAX_DATA_CHUNK_SIZE)
+            bytes_to_discard = BTRFS_MAX_DATA_CHUNK_SIZE;

That MAX_DATA_CHUNK_SIZE is only possible for RAID0/RAID10/RAID5/RAID6, by spanning the device extents across multiple devices.

For each device, the maximum size is limited to 1G (check init_alloc_chunk_ctl_policy_regular()).

So you can just limit it to 1G instead.
(If you want, you can also extract that into a macro as a cleanup).
I think SZ_1G is enough for now.

Furthermore, you can use min() instead of an if ().

So you only need:

        bytes_to_discard = min(SZ_1G, bytes_left);

Otherwise this looks good enough to me.
If the 1G size is not good enough, we can later tune it to smaller values.

Personally speaking I think 1G would be enough.
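
To illustrate the suggestion above, here is a minimal userspace sketch of the min()-based loop with a 1G cap. It is not the kernel code: discard_in_chunks(), discard_fn, min_u64(), SKETCH_SZ_1G, struct discard_stats and count_discard() are hypothetical names made up for this example, and the callback only stands in for blkdev_issue_discard(). As a simplification, the sketch stops on any error, which differs from the -EOPNOTSUPP handling in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's SZ_1G. */
#define SKETCH_SZ_1G (1ULL << 30)

/* Hypothetical callback type standing in for blkdev_issue_discard(). */
typedef int (*discard_fn)(uint64_t start, uint64_t len, void *ctx);

static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Discard [start, start + len) in pieces of at most @chunk bytes.
 * Returns the first error from @issue, or 0 once the whole range is done.
 * *discarded is increased by the number of bytes actually discarded.
 */
static int discard_in_chunks(uint64_t start, uint64_t len, uint64_t chunk,
			     discard_fn issue, void *ctx, uint64_t *discarded)
{
	assert(chunk > 0);

	while (len) {
		uint64_t bytes_to_discard = min_u64(chunk, len);
		int ret = issue(start, bytes_to_discard, ctx);

		if (ret)
			return ret;

		start += bytes_to_discard;
		len -= bytes_to_discard;
		*discarded += bytes_to_discard;
	}
	return 0;
}

/* Example callback: counts requests and remembers the largest one. */
struct discard_stats {
	uint64_t calls;
	uint64_t largest;
};

static int count_discard(uint64_t start, uint64_t len, void *ctx)
{
	struct discard_stats *stats = ctx;

	(void)start;
	stats->calls++;
	if (len > stats->largest)
		stats->largest = len;
	return 0;
}
```

With the [256G, 8T) range from the commit message, calling discard_in_chunks() with count_discard() and a chunk of SKETCH_SZ_1G issues 7936 requests of 1G each instead of one request covering the whole range, which is what leaves room for a cancellation check between requests.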

Thanks,
Qu

Ack, done in v5.

+        else
+            bytes_to_discard = bytes_left;
+
          ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
-                       bytes_left >> SECTOR_SHIFT,
+                       bytes_to_discard >> SECTOR_SHIFT,
                         GFP_NOFS);
-        if (!ret)
-            *discarded_bytes += bytes_left;
+
+        if (ret) {
+            if (ret != -EOPNOTSUPP)
+                break;
+            continue;
+        }
+
+        start += bytes_to_discard;
+        bytes_left -= bytes_to_discard;
+        *discarded_bytes += bytes_to_discard;
      }
+
      return ret;
  }