Re: [PATCH] btrfs: also add stripe entries for NOCOW writes

From: Qu Wenruo
Date: Mon Sep 23 2024 - 18:36:18 EST




On 2024/9/24 00:11, Johannes Thumshirn wrote:
On 23.09.24 10:54, Qu Wenruo wrote:


[...]
Finally, I do not think it's a good idea to insert RST entries for NOCOW.
If a file is set NOCOW, it means we'll be doing a lot of overwrites to it.
Then why waste our time updating the RST entries again and again?

Isn't such behavior going to cause more write amplification? Meanwhile
for non-RST cases, NOCOW should cause the least amount of write
amplification.

The whole idea behind the RST was to write the RST entries _after_ the
data has been persisted to disk. Otherwise we're back at the write hole
problem. See for example this imaginary sequence:

Preallocate a range. This will then also preallocate the RST entries
with the mapping as you describe. Write to it, and while you write you
have a power loss. The copy/stripe to disk 1 is correctly written but
disk 2 didn't report back before the power loss happened.
After we have power again, a read to disk 2 comes in; as we have an RST
entry, the read will be directed to the broken entry and garbage is
returned. And this is the good case, as we can repair it.
If it was an overwrite of a block and the same happens, we have an RST
entry pointing to a good and a bad copy.

Nope, that will not happen.

Because our metadata is still COW protected, after such a power loss the
file extent still shows that range as PREALLOCATED, so we won't even
trigger a read.

And this is exactly the same as the non-RST PREALLOCATED write.
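
As a rough illustration only, here is a toy model of that argument. It is
not btrfs code: ToyFs, ExtentType and every field name are made up for this
sketch, and it assumes a two-copy profile where the file extent is converted
from PREALLOCATED to regular only after both copies have been written.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExtentType(Enum):
    PREALLOC = "prealloc"
    REGULAR = "regular"


@dataclass
class ToyFs:
    # Committed, COW-protected metadata: the only state trusted after a crash.
    committed_type: ExtentType = ExtentType.PREALLOC
    # Data on the two copies; None stands for stale/garbage contents.
    disk1: Optional[bytes] = None
    disk2: Optional[bytes] = None

    def write(self, data: bytes, crash_before_disk2: bool = False) -> None:
        self.disk1 = data
        if crash_before_disk2:
            # Power loss: disk 2 never completed, metadata was never updated.
            return
        self.disk2 = data
        # Only after both copies are persisted is the extent converted.
        self.committed_type = ExtentType.REGULAR

    def read(self, length: int) -> Optional[bytes]:
        if self.committed_type is ExtentType.PREALLOC:
            # A PREALLOCATED range reads back as zeros; no device read,
            # so the garbage on disk 2 is never looked at.
            return bytes(length)
        return self.disk2


crashed = ToyFs()
crashed.write(b"abcd", crash_before_disk2=True)
after_crash = crashed.read(4)   # zeros, despite disk2 holding garbage

clean = ToyFs()
clean.write(b"abcd")
after_clean = clean.read(4)     # the written data
```

In this model the half-written copy is unreachable after the crash, whether
or not a mapping for the range already exists.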


If we add the RST entries only after both writes succeed, the problem
isn't there. So for preallocated extents it is even harmful to add an RST
entry.

You just forgot the metadata part, which prevents the problem from
happening in the first place.

Thanks,
Qu