Re: [f2fs-dev] [PATCH] f2fs: avoid frequent costly fsck triggers

From: Jaegeuk Kim
Date: Fri Nov 30 2018 - 15:28:11 EST


On 11/30, Chao Yu wrote:
> On 2018/11/30 10:35, Sheng Yong wrote:
> > Hi, Jaegeuk and Chao,
> >
> > On 2018/11/29 1:48, Jaegeuk Kim wrote:
> >> On 11/28, Chao Yu wrote:
> >>> On 2018/11/28 16:10, Jaegeuk Kim wrote:
> >>>> On 11/28, Chao Yu wrote:
> >>>>> Hi Jaeguek,
> >>>>>
> >>>>> On 2018/11/28 15:31, Jaegeuk Kim wrote:
> >>>>>> If we want to re-enable nat_bits, we rely on fsck which requires full scan
> >>>>>> of directory tree. Let's do that by regular fsck or unclean shutdown.
> >>>>>
> >>>>> Reviewed-by: Chao Yu <yuchao0@xxxxxxxxxx>
> >>>>>
> >>>>> BTW, I have patch made some month ago...
> >>>>>
> >>>>> In order to detect nat_bits disabling, could we introduce one more flag for
> >>>>> fsck?
> >>>>
> >>>> Do we have a way to enable nat_bits very quickly in fsck?
> >>>
> >>> For image with SBI_NATBIT_NEED_REPAIR flag, can we just check metadata and
> >>> rebuild nat_bits based on verified nat blocks/journals?
> >>
> >> I'm leaning to rely on full scan to enable nat_bits again. We may add a mount
> >> count or timer to trigger fsck regularly?
> >
> > I'm afraid regular full fsck would give us bad experience of booting time.
> > FYI, 256GB storage, which is filled with small files, costs almost 10 min
> > to do a full fsck. And it seems larger capacity storages are on the way.
>
> Agreed.

Agreed. So, that's why I wrote this patch.

>
> > So, is it worth doing that only to enable nat_bits (plus checking f2fs
> > consistent not that necessarily)?
>
> In android environment, I think it may be too expensive for adding nat_bits
> by triggering full scan by fsck during boot time.

That's why I'd like to enable this only when we need full scan.

>
> If we can update all nat bitmap in free time after mount, maybe we can
> rebuild nat_bits based on full nat bitmap during umount, which can be
> cheaper than rebuiding in userspace.

Yeah, rebuiling nat_bits in run time would be better, but can be applied
in future. But, since Android reboot procedure uses a timeout, if we exceed
it, we'll get unclean unmount which triggers another fsck, which doesn't
make sense at all.

>
> Thanks,
>
> >
> > Thanks
> >>
> >>>
> >>> Thanks,
> >>>
> >>>>
> >>>>>
> >>>>> >From 86e8bdb2faeec904944bb6621073f4f7de51cc2d Mon Sep 17 00:00:00 2001
> >>>>> From: Chao Yu <yuchao0@xxxxxxxxxx>
> >>>>> Date: Sun, 9 Sep 2018 05:40:20 +0800
> >>>>> Subject: [PATCH] f2fs: set specified flag after invalidating nat_bits
> >>>>>
> >>>>> Signed-off-by: Chao Yu <yuchao0@xxxxxxxxxx>
> >>>>> ---
> >>>>> fs/f2fs/checkpoint.c | 12 +++++++++++-
> >>>>> fs/f2fs/f2fs.h | 3 ++-
> >>>>> fs/f2fs/node.c | 3 +++
> >>>>> include/linux/f2fs_fs.h | 1 +
> >>>>> 4 files changed, 17 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >>>>> index 7e17bb3dfcb1..f7fb14e0f5f9 100644
> >>>>> --- a/fs/f2fs/checkpoint.c
> >>>>> +++ b/fs/f2fs/checkpoint.c
> >>>>> @@ -1226,13 +1226,16 @@ static void update_ckpt_flags(struct f2fs_sb_info
> >>>>> *sbi, struct cp_control *cpc)
> >>>>> unsigned long orphan_num = sbi->im[ORPHAN_INO].ino_num;
> >>>>> struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
> >>>>> unsigned long flags;
> >>>>> + bool disable_natbits = false;
> >>>>>
> >>>>> spin_lock_irqsave(&sbi->cp_lock, flags);
> >>>>>
> >>>>> if ((cpc->reason & CP_UMOUNT) &&
> >>>>> le32_to_cpu(ckpt->cp_pack_total_block_count) >
> >>>>> - sbi->blocks_per_seg - NM_I(sbi)->nat_bits_blocks)
> >>>>> + sbi->blocks_per_seg - NM_I(sbi)->nat_bits_blocks) {
> >>>>> disable_nat_bits(sbi, false);
> >>>>> + disable_natbits = true;
> >>>>> + }
> >>>>>
> >>>>> if (cpc->reason & CP_TRIMMED)
> >>>>> __set_ckpt_flags(ckpt, CP_TRIMMED_FLAG);
> >>>>> @@ -1270,11 +1273,18 @@ static void update_ckpt_flags(struct f2fs_sb_info
> >>>>> *sbi, struct cp_control *cpc)
> >>>>> if (is_sbi_flag_set(sbi, SBI_QUOTA_NEED_REPAIR))
> >>>>> __set_ckpt_flags(ckpt, CP_QUOTA_NEED_FSCK_FLAG);
> >>>>>
> >>>>> + if (is_sbi_flag_set(sbi, SBI_NATBIT_NEED_REPAIR))
> >>>>> + __set_ckpt_flags(ckpt, CP_NATBIT_NEED_FSCK_FLAG);
> >>>>> +
> >>>>> /* set this flag to activate crc|cp_ver for recovery */
> >>>>> __set_ckpt_flags(ckpt, CP_CRC_RECOVERY_FLAG);
> >>>>> __clear_ckpt_flags(ckpt, CP_NOCRC_RECOVERY_FLAG);
> >>>>>
> >>>>> spin_unlock_irqrestore(&sbi->cp_lock, flags);
> >>>>> +
> >>>>> + if (disable_natbits)
> >>>>> + f2fs_msg(sbi->sb, KERN_NOTICE,
> >>>>> + "No enough space in CP area, disable nat_bits.");
> >>>>> }
> >>>>>
> >>>>> static void commit_checkpoint(struct f2fs_sb_info *sbi,
> >>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>>>> index f0cedbe0c536..b55341c269b2 100644
> >>>>> --- a/fs/f2fs/f2fs.h
> >>>>> +++ b/fs/f2fs/f2fs.h
> >>>>> @@ -1107,6 +1107,7 @@ enum {
> >>>>> SBI_QUOTA_NEED_FLUSH, /* need to flush quota info in CP */
> >>>>> SBI_QUOTA_SKIP_FLUSH, /* skip flushing quota in current CP */
> >>>>> SBI_QUOTA_NEED_REPAIR, /* quota file may be corrupted */
> >>>>> + SBI_NATBIT_NEED_REPAIR, /* nat full/empty bitmaps need repair */
> >>>>> };
> >>>>>
> >>>>> enum {
> >>>>> @@ -1628,7 +1629,7 @@ static inline void disable_nat_bits(struct
> >>>>> f2fs_sb_info *sbi, bool lock)
> >>>>> {
> >>>>> unsigned long flags;
> >>>>>
> >>>>> - set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>>> + set_sbi_flag(sbi, SBI_NATBIT_NEED_REPAIR);
> >>>>>
> >>>>> if (lock)
> >>>>> spin_lock_irqsave(&sbi->cp_lock, flags);
> >>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >>>>> index e57add1e8966..0c6f8312a087 100644
> >>>>> --- a/fs/f2fs/node.c
> >>>>> +++ b/fs/f2fs/node.c
> >>>>> @@ -2902,6 +2902,9 @@ static int __get_nat_bitmaps(struct f2fs_sb_info *sbi)
> >>>>>
> >>>>> cp_ver |= (cur_cp_crc(ckpt) << 32);
> >>>>> if (cpu_to_le64(cp_ver) != *(__le64 *)nm_i->nat_bits) {
> >>>>> + f2fs_msg(sbi->sb, KERN_NOTICE,
> >>>>> + "Disable nat_bits due to incorrect cp_ver (%llu, %llu)",
> >>>>> + cp_ver, le64_to_cpu(*(__le64 *)nm_i->nat_bits));
> >>>>> disable_nat_bits(sbi, true);
> >>>>> return 0;
> >>>>> }
> >>>>> diff --git a/include/linux/f2fs_fs.h b/include/linux/f2fs_fs.h
> >>>>> index 7196653833fa..1f3ae1504573 100644
> >>>>> --- a/include/linux/f2fs_fs.h
> >>>>> +++ b/include/linux/f2fs_fs.h
> >>>>> @@ -117,6 +117,7 @@ struct f2fs_super_block {
> >>>>> /*
> >>>>> * For checkpoint
> >>>>> */
> >>>>> +#define CP_NATBIT_NEED_FSCK_FLAG 0X00002000
> >>>>> #define CP_DISABLED_FLAG 0x00001000
> >>>>> #define CP_QUOTA_NEED_FSCK_FLAG 0x00000800
> >>>>> #define CP_LARGE_NAT_BITMAP_FLAG 0x00000400
> >>>>> --
> >>>>> 2.18.0.rc1
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> fs/f2fs/f2fs.h | 6 +++++-
> >>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>>>>> index c28a9d1cb278..aa500239baf2 100644
> >>>>>> --- a/fs/f2fs/f2fs.h
> >>>>>> +++ b/fs/f2fs/f2fs.h
> >>>>>> @@ -1621,7 +1621,11 @@ static inline void disable_nat_bits(struct f2fs_sb_info *sbi, bool lock)
> >>>>>> {
> >>>>>> unsigned long flags;
> >>>>>>
> >>>>>> - set_sbi_flag(sbi, SBI_NEED_FSCK);
> >>>>>> + /*
> >>>>>> + * In order to re-enable nat_bits we need to call fsck.f2fs by
> >>>>>> + * set_sbi_flag(sbi, SBI_NEED_FSCK). But it may give huge cost,
> >>>>>> + * so let's rely on regular fsck or unclean shutdown.
> >>>>>> + */
> >>>>>>
> >>>>>> if (lock)
> >>>>>> spin_lock_irqsave(&sbi->cp_lock, flags);
> >>>>>>
> >>>>
> >>>> .
> >>>>
> >>
> >>
> >> _______________________________________________
> >> Linux-f2fs-devel mailing list
> >> Linux-f2fs-devel@xxxxxxxxxxxxxxxxxxxxx
> >> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >>
> >> .
> >>
> >
> >
> > .
> >