Re: [f2fs-dev] [PATCH] f2fs: fix to avoid data corruption by forbidding SSR overwrite
From: Jaegeuk Kim
Date: Thu Sep 26 2019 - 20:32:17 EST
On 09/26, Jaegeuk Kim wrote:
> On 09/26, Eric Biggers wrote:
> > On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> > > There is one case that can cause data corruption.
> > >
> > > - write 4k to fileA
> > > - fsync fileA, 4k data is written back to lbaA
> > > - write 4k to fileA
> > > - kworker flushes 4k to lbaB; the dnode containing lbaB has not been persisted yet
> > > - write 4k to fileB
> > > - kworker flushes 4k to lbaA due to SSR
> > > - SPOR -> the dnode with lbaA will be recovered, however lbaA contains fileB's
> > > data
> > >
> > > One solution is to track the block history of every fsynced file and disallow
> > > SSR overwrite of blocks newly invalidated on that file.
> > >
> > > However, during recovery, no matter whether a dnode was flushed or fsynced, all
> > > previous dnodes up to the last fsynced one in the node chain can be recovered.
> > > That means we would need to record every block change in flushed dnodes, which
> > > would be costly, so let's just use a simple fix: forbid SSR overwrite
> > > directly.
> > >
> > > Signed-off-by: Chao Yu <yuchao0@xxxxxxxxxx>
> > > ---
> > > fs/f2fs/segment.c | 8 +++++---
> > > 1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > index 9d9d9a050d59..69b3b553ee6b 100644
> > > --- a/fs/f2fs/segment.c
> > > +++ b/fs/f2fs/segment.c
> > > @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
> > > if (!f2fs_test_and_set_bit(offset, se->discard_map))
> > > sbi->discard_blks--;
> > >
> > > - /* don't overwrite by SSR to keep node chain */
> > > - if (IS_NODESEG(se->type) &&
> > > - !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > > + /*
> > > + * SSR should never reuse block which is checkpointed
> > > + * or newly invalidated.
> > > + */
> > > + if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> > > if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
> > > se->ckpt_valid_blocks++;
> > > }
> > > --
> >
> > FYI, this commit caused xfstests generic/064 to start failing:
>
> Yup, I was looking at this.
It seems fcollapse can't allocate blocks sequentially when rewriting them.
We need to adjust the test like this:
---
tests/generic/064 | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tests/generic/064 b/tests/generic/064
index 1ace14b6..058258d5 100755
--- a/tests/generic/064
+++ b/tests/generic/064
@@ -72,7 +72,9 @@ done
extent_after=`_count_extents $dest`
if [ $extent_before -ne $extent_after ]; then
- echo "extents mismatched before = $extent_before after = $extent_after"
+ if [ "$FSTYP" != "f2fs" ] || [ $extent_before -ne 1 ] || [ $extent_after -ne 50 ]; then
+ echo "extents mismatched before = $extent_before after = $extent_after"
+ fi
fi
# compare original file and test file.
--
2.19.0.605.g01d371f741-goog
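
For reference, the condition in the generic/064 change above can be tried
outside of xfstests. This is only a rough sketch: report_extent_mismatch is
a hypothetical helper, not something in xfstests, and the file system type
and extent counts are passed in by hand instead of coming from $FSTYP and
_count_extents. It reports a mismatch unless the file system is f2fs with
exactly 1 extent before and 50 extents after.

```shell
#!/bin/sh
# Hypothetical helper (not part of xfstests): mirrors the condition used in
# the generic/064 change. Arguments: fs type, extents before, extents after.
report_extent_mismatch() {
	fstyp=$1
	before=$2
	after=$3

	if [ "$before" -ne "$after" ]; then
		# Stay quiet only for f2fs going from 1 extent to 50 extents.
		if [ "$fstyp" != "f2fs" ] || [ "$before" -ne 1 ] || [ "$after" -ne 50 ]; then
			echo "extents mismatched before = $before after = $after"
		fi
	fi
}

report_extent_mismatch f2fs 1 50	# tolerated on f2fs: prints nothing
report_extent_mismatch ext4 1 50	# other file systems: still reported
report_extent_mismatch f2fs 1 49	# f2fs with other counts: still reported
```

Note that the exemption is tied to the exact counts 1 and 50, so any other
extent layout on f2fs still shows up as a mismatch.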