Re: [6.2][regression] after commit 947a629988f191807d2d22ba63ae18259bb645c5 btrfs volume periodical forced switch to readonly after a lot of disk writes

From: Qu Wenruo
Date: Wed Dec 28 2022 - 22:06:35 EST




On 2022/12/29 08:08, Mikhail Gavrilov wrote:
On Thu, Dec 29, 2022 at 4:31 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:

Are you using qgroup? If so, it may be worth trying to disable qgroup.

I do not use quota.
And it looks like my distro does not use quota by default.
❯ btrfs qgroup show -f /
ERROR: can't list qgroups: quotas not enabled

But for newer kernels, a qgroup hang should only happen when dropping a
large snapshot. I don't know if podman pull would cause older snapshots
to be deleted...

It is not a regression, it also happened on older kernels.
But it is really annoying when the browser process waits when "podman
pull" writes changes to disk.
In fact, I have been waiting 5 years for caching of slow HDDs on an
SSD, but apparently I can’t wait any longer.
And I started slowly buying expensive large SSDs to replace the big
HDD. I still can’t find time to connect the D5 P5316 30.72 TB to the
primary workstation.
I want to make a video review of it. I understand this is an expensive
solution and not suitable for everyone, unlike an affordable SSD
cache.

That is really sad to hear.

For now, I only have several guesses on how this could happen.

- Extra seeks between metadata and data chunks
You can try mixed block groups, but this has to be chosen at mkfs time.

- Extra inlined extents causing too much metadata overhead
You can disable inline extents using max_inline=0 as a mount option.
But that only affects newly created files, not the existing ones.
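
As a rough sketch of the two suggestions above (a dry run that only prints
the commands; the device /dev/sdX and mount point /mnt are placeholders I am
assuming here, not values from this report):

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# DEV and MNT are hypothetical placeholders; adjust before real use.
DEV="${DEV:-/dev/sdX}"
MNT="${MNT:-/mnt}"

suggest_cmds() {
    # Mixed block groups: data and metadata share the same chunks.
    # This can only be selected at mkfs time, and mkfs.btrfs
    # destroys whatever is currently on the device.
    echo "mkfs.btrfs --mixed $DEV"
    # Disable inline extents; only newly written files are affected.
    echo "mount -o max_inline=0 $DEV $MNT"
}

suggest_cmds
```

Since --mixed requires recreating the filesystem, trying max_inline=0 first is
the cheaper experiment.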

Otherwise, using bcache may be a solution.
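
A minimal sketch of what a bcache setup could look like, again as a dry run
that only prints the commands. The device names are assumptions (/dev/sdX for
the HDD, /dev/sdY for the SSD), and make-bcache formats both devices, so
existing data would have to be backed up and restored:

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# BACKING (HDD) and CACHE (SSD) are hypothetical placeholders.
BACKING="${BACKING:-/dev/sdX}"
CACHE="${CACHE:-/dev/sdY}"

print_bcache_setup() {
    # Create the backing and cache devices in one step; when both are
    # given together, make-bcache attaches the cache to the backing device.
    echo "make-bcache -B $BACKING -C $CACHE"
    # The combined device normally shows up as /dev/bcache0 (assumed here);
    # btrfs is then created on top of it.
    echo "mkfs.btrfs /dev/bcache0"
}

print_bcache_setup
```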

For now I'm not aware of any HDD-specific tests, other than for zoned devices, so the performance problem can be really hard to debug.

Thanks,
Qu