On 08/01, Chao Yu wrote:
On 2021/7/31 6:18, Jaegeuk Kim wrote:
On 07/20, Chao Yu wrote:
On 2021/7/20 2:25, Jaegeuk Kim wrote:
On 07/19, Chao Yu wrote:
On 2021/4/27 20:37, Chao Yu wrote:
I think just reverting dirty/free bitmap is not enough if checkpoint fails,
due to we have updated sbi->cur_cp_pack and nat/sit bitmap, next CP tries
to overwrite last valid meta/node/data, then filesystem will be corrupted.
So I suggest to set cp_error if do_checkpoint() fails until we can handle
all cases, which is not so easy.
How do you think?
Let's add below patch first before you figure out the patch which covers all
things.
From 3af957c98e9e04259f8bb93ca0b74ba164f3f27e Mon Sep 17 00:00:00 2001
From: Chao Yu <chao@xxxxxxxxxx>
Date: Mon, 19 Jul 2021 16:37:44 +0800
Subject: [PATCH] f2fs: fix to stop filesystem update once CP failed
During f2fs_write_checkpoint(), once we failed in
f2fs_flush_nat_entries() or do_checkpoint(), metadata of filesystem
such as prefree bitmap, nat/sit version bitmap won't be recovered,
it may cause f2fs image to be inconsistent, let's just set CP error
flag to avoid further updates until we figure out a scheme to rollback
all metadatas in such condition.
Reported-by: Yangtao Li <frank.li@xxxxxxxx>
Signed-off-by: Yangtao Li <frank.li@xxxxxxxx>
Signed-off-by: Chao Yu <chao@xxxxxxxxxx>
---
fs/f2fs/checkpoint.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 6c208108d69c..096c85022f62 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1639,8 +1639,10 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
/* write cached NAT/SIT entries to NAT/SIT area */
err = f2fs_flush_nat_entries(sbi, cpc);
- if (err)
+ if (err) {
+ f2fs_stop_checkpoint(sbi, false);
I think we should abuse this, since we can get any known ENOMEM as well.
Yup, but one critical issue here is it can break A/B update of NAT area,
so, in order to fix this hole, how about using NOFAIL memory allocation
in f2fs_flush_nat_entries() first until we figure out the finial scheme?
NOFAIL is risky, so how about adding a retry logic on ENOMEM with a message
and then giving up if we can't get the memory? BTW, what about EIO or other
family?
How about this?
Hmm, it seems we won't get ENOMEM.
__flush_nat_entry_set
-> get_next_nat_page
-> ...
-> __get_meta_page
-> repeat on ENOMEM, but stop_checkpoint on EIO
If we have an error here, we should have stopped checkpoint. Have you seen other
issue?