Re: [PATCH v3] f2fs: add cur_reserved_blocks to support soft block reservation

From: Chao Yu
Date: Wed Oct 25 2017 - 11:46:50 EST


On 2017/10/25 22:06, Yunlong Song wrote:
> Hi, Chao,
> ÂÂÂ Please see my comments below.
>
> On 2017/10/25 20:26, Chao Yu wrote:
>> On 2017/10/25 18:02, Yunlong Song wrote:
>>> ping...
>> I've replied in this thread, check your email list please, or you can check the
>> comments in below link:
>>
>> https://patchwork.kernel.org/patch/9909407/
>>
>> Anyway, see comments below.
>>
>>> On 2017/8/18 23:09, Yunlong Song wrote:
>>>> This patch adds cur_reserved_blocks to extend reserved_blocks sysfs
>>>> interface to be soft threshold, which allows user configure it exceeding
>>>> current available user space. To ensure there is enough space for
>>>> supporting system's activation, this patch does not set the reserved space
>>>> to the configured reserved_blocks value at once, instead, it safely
>>>> increase cur_reserved_blocks in dev_valid_block(,node)_count to only take
>>>> up the blocks which are just obsoleted.
>>>>
>>>> Signed-off-by: Yunlong Song <yunlong.song@xxxxxxxxxx>
>>>> Signed-off-by: Chao Yu <yuchao0@xxxxxxxxxx>
>>>> ---
>>>> ÂÂ Documentation/ABI/testing/sysfs-fs-f2fs |Â 3 ++-
>>>> ÂÂ fs/f2fs/f2fs.hÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | 13 +++++++++++--
>>>> ÂÂ fs/f2fs/super.cÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ |Â 3 ++-
>>>> ÂÂ fs/f2fs/sysfs.cÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ | 15 +++++++++++++--
>>>> ÂÂ 4 files changed, 28 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> index 11b7f4e..ba282ca 100644
>>>> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
>>>> @@ -138,7 +138,8 @@ What:ÂÂÂÂÂÂÂ /sys/fs/f2fs/<disk>/reserved_blocks
>>>> ÂÂ Date:ÂÂÂÂÂÂÂ June 2017
>>>> ÂÂ Contact:ÂÂÂ "Chao Yu" <yuchao0@xxxxxxxxxx>
>>>> ÂÂ Description:
>>>> -ÂÂÂÂÂÂÂÂ Controls current reserved blocks in system.
>>>> +ÂÂÂÂÂÂÂÂ Controls current reserved blocks in system, the threshold
>>>> +ÂÂÂÂÂÂÂÂ is soft, it could exceed current available user space.
>>>> ÂÂÂÂ What:ÂÂÂÂÂÂÂ /sys/fs/f2fs/<disk>/gc_urgent
>>>> ÂÂ Date:ÂÂÂÂÂÂÂ August 2017
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 2f20b6b..84ccbdc 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -1041,6 +1041,7 @@ struct f2fs_sb_info {
>>>> ÂÂÂÂÂÂ block_t discard_blks;ÂÂÂÂÂÂÂÂÂÂÂ /* discard command candidats */
>>>> ÂÂÂÂÂÂ block_t last_valid_block_count;ÂÂÂÂÂÂÂ /* for recovery */
>>>> ÂÂÂÂÂÂ block_t reserved_blocks;ÂÂÂÂÂÂÂ /* configurable reserved blocks */
>>>> +ÂÂÂ block_t cur_reserved_blocks;ÂÂÂÂÂÂÂ /* current reserved blocks */
>>>> ÂÂÂÂÂÂÂÂ u32 s_next_generation;ÂÂÂÂÂÂÂÂÂÂÂ /* for NFS support */
>>>> ÂÂ @@ -1515,7 +1516,8 @@ static inline int inc_valid_block_count(struct f2fs_sb_info *sbi,
>>>> ÂÂÂÂÂÂÂÂ spin_lock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂ sbi->total_valid_block_count += (block_t)(*count);
>>>> -ÂÂÂ avail_user_block_count = sbi->user_block_count - sbi->reserved_blocks;
>>>> +ÂÂÂ avail_user_block_count = sbi->user_block_count -
>>>> +ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->cur_reserved_blocks;
>>>> ÂÂÂÂÂÂ if (unlikely(sbi->total_valid_block_count > avail_user_block_count)) {
>>>> ÂÂÂÂÂÂÂÂÂÂ diff = sbi->total_valid_block_count - avail_user_block_count;
>>>> ÂÂÂÂÂÂÂÂÂÂ *count -= diff;
>>>> @@ -1549,6 +1551,10 @@ static inline void dec_valid_block_count(struct f2fs_sb_info *sbi,
>>>> ÂÂÂÂÂÂ f2fs_bug_on(sbi, sbi->total_valid_block_count < (block_t) count);
>>>> ÂÂÂÂÂÂ f2fs_bug_on(sbi, inode->i_blocks < sectors);
>>>> ÂÂÂÂÂÂ sbi->total_valid_block_count -= (block_t)count;
>>>> +ÂÂÂ if (sbi->reserved_blocks &&
>>>> +ÂÂÂÂÂÂÂ sbi->reserved_blocks != sbi->cur_reserved_blocks)
>> It's redundent check here...
> I think in most cases, cur_reserved_blocks is equal to reserved_blocks, so we do not need to calculate min any more, otherwise,
> if reserved_blocks is not 0, it will calculate min and set current_reserved_blocks each time.

OK, IMO, in some condition, we can save dirtying cache line to decrease cache
line missing with that check.

>>
>>>> +ÂÂÂÂÂÂÂ sbi->cur_reserved_blocks = min(sbi->reserved_blocks,
>>>> +ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->cur_reserved_blocks + count);
>>>> ÂÂÂÂÂÂ spin_unlock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂ f2fs_i_blocks_write(inode, count, false, true);
>>>> ÂÂ }
>>>> @@ -1695,7 +1701,7 @@ static inline int inc_valid_node_count(struct f2fs_sb_info *sbi,
>>>> ÂÂÂÂÂÂ spin_lock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂÂÂ valid_block_count = sbi->total_valid_block_count + 1;
>>>> -ÂÂÂ if (unlikely(valid_block_count + sbi->reserved_blocks >
>>>> +ÂÂÂ if (unlikely(valid_block_count + sbi->cur_reserved_blocks >
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->user_block_count)) {
>>>> ÂÂÂÂÂÂÂÂÂÂ spin_unlock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂÂÂÂÂ goto enospc;
>>>> @@ -1738,6 +1744,9 @@ static inline void dec_valid_node_count(struct f2fs_sb_info *sbi,
>>>> ÂÂÂÂÂÂÂÂ sbi->total_valid_node_count--;
>>>> ÂÂÂÂÂÂ sbi->total_valid_block_count--;
>>>> +ÂÂÂ if (sbi->reserved_blocks &&
>>>> +ÂÂÂÂÂÂÂ sbi->reserved_blocks != sbi->cur_reserved_blocks)
>> Checking low boundary is more safe here.
> I think cur_reserved_blocks can never be larger than reserved_blocks in any case. so min(reserved_blocks,
> cur_reserved_blocks +1) is same to cur_reserved_blocks++ when reserved_blocks != cur_reserved_blocks
> (which means reserved_blocks > cur_reserved_block )

Ditto.

>>
>>>> +ÂÂÂÂÂÂÂ sbi->cur_reserved_blocks++;
>>>> ÂÂÂÂÂÂÂÂ spin_unlock(&sbi->stat_lock);
>>>> ÂÂ diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>> index 4c1bdcb..16a805f 100644
>>>> --- a/fs/f2fs/super.c
>>>> +++ b/fs/f2fs/super.c
>>>> @@ -957,7 +957,7 @@ static int f2fs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>> ÂÂÂÂÂÂ buf->f_blocks = total_count - start_count;
>>>> ÂÂÂÂÂÂ buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
>>>> ÂÂÂÂÂÂ buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
>>>> -ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->reserved_blocks;
>>>> +ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->cur_reserved_blocks;
>>>> ÂÂÂÂÂÂÂÂ avail_node_count = sbi->total_node_count - F2FS_RESERVED_NODE_NUM;
>>>> ÂÂ @@ -2411,6 +2411,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ le64_to_cpu(sbi->ckpt->valid_block_count);
>>>> ÂÂÂÂÂÂ sbi->last_valid_block_count = sbi->total_valid_block_count;
>>>> ÂÂÂÂÂÂ sbi->reserved_blocks = 0;
>>>> +ÂÂÂ sbi->cur_reserved_blocks = 0;
>>>> ÂÂÂÂÂÂÂÂ for (i = 0; i < NR_INODE_TYPE; i++) {
>>>> ÂÂÂÂÂÂÂÂÂÂ INIT_LIST_HEAD(&sbi->inode_list[i]);
>>>> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
>>>> index a1be5ac..75c37bb 100644
>>>> --- a/fs/f2fs/sysfs.c
>>>> +++ b/fs/f2fs/sysfs.c
>>>> @@ -104,12 +104,22 @@ static ssize_t features_show(struct f2fs_attr *a,
>>>> ÂÂÂÂÂÂ return len;
>>>> ÂÂ }
>>>> ÂÂ +static ssize_t f2fs_reserved_blocks_show(struct f2fs_attr *a,
>>>> +ÂÂÂÂÂÂÂ struct f2fs_sb_info *sbi, char *buf)
>>>> +{
>>>> +ÂÂÂ return snprintf(buf, PAGE_SIZE, "expected: %u\ncurrent: %u\n",
>>>> +ÂÂÂÂÂÂÂÂÂÂÂ sbi->reserved_blocks, sbi->cur_reserved_blocks);
>>>> +}
>>>> +
>>>> ÂÂ static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ struct f2fs_sb_info *sbi, char *buf)
>>>> ÂÂ {
>>>> ÂÂÂÂÂÂ unsigned char *ptr = NULL;
>>>> ÂÂÂÂÂÂ unsigned int *ui;
>>>> ÂÂ +ÂÂÂ if (a->struct_type == RESERVED_BLOCKS)
>>>> +ÂÂÂÂÂÂÂ return f2fs_reserved_blocks_show(a, sbi, buf);
>>>> +
>>>> ÂÂÂÂÂÂ ptr = __struct_ptr(sbi, a->struct_type);
>>>> ÂÂÂÂÂÂ if (!ptr)
>>>> ÂÂÂÂÂÂÂÂÂÂ return -EINVAL;
>>>> @@ -143,12 +153,13 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>>>> ÂÂ #endif
>>>> ÂÂÂÂÂÂ if (a->struct_type == RESERVED_BLOCKS) {
>>>> ÂÂÂÂÂÂÂÂÂÂ spin_lock(&sbi->stat_lock);
>>>> -ÂÂÂÂÂÂÂ if ((unsigned long)sbi->total_valid_block_count + t >
>>>> -ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ (unsigned long)sbi->user_block_count) {
>>>> +ÂÂÂÂÂÂÂ if (t > (unsigned long)sbi->user_block_count) {
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ spin_unlock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return -EINVAL;
>>>> ÂÂÂÂÂÂÂÂÂÂ }
>>>> ÂÂÂÂÂÂÂÂÂÂ *ui = t;
>>>> +ÂÂÂÂÂÂÂ if (t < (unsigned long)sbi->cur_reserved_blocks)
>>>> +ÂÂÂÂÂÂÂÂÂÂÂ sbi->cur_reserved_blocks = t;
>> No, for 't < cur_reserved_blocks' case, cur_reserved_blocks will out of update
>> even if there is enough free space. You know, for soft block resevation, we need
>> to reserve blocks as many as possible, making free space being zero suddenly is
>> possible.
> I do not understand why it is not safe to decrease cur_reserved_blocks, for example,
> if current cur_reserved_blocks is 100, now decrease it to 80, is there any problem?
> If 80 will make free space zero, how does 100 exist?
> And I do not think it is safe as following:
> ÂÂÂÂÂÂÂÂ *ui = t;
> +ÂÂÂÂÂÂÂ sbi->current_reserved_blocks = min(sbi->reserved_blocks,
> +ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ sbi->user_block_count - valid_user_blocks(sbi));
>
> If user_block_count = 200, valid_user_blocks=150ï reserved_blocks = 100,
> then current_reserved_block = min(100,200-50) = 50, in this case, free space
> is suddenly becoming zero.
Free space becomes zero suddenly is safe, as I said before, I don't expect this
feature can be used in android, instead, it may be used in distributed storage
scenario, in where, once we configure soft_reserved_block making one server out
of free space, it's not critical issue to this system since we can move current
copy to another server which has enough free space.

Secondly, as an global configuration, it's due to usage of administrator with
it, if there is critical application which is sensitive with free space,
administrator should make sure our reservation should not overload consuming free
space, which means soft reservation is not suitable.

> To avoid this, I change the code to:
>
> +ÂÂÂÂÂÂÂ if (t < (unsigned long)sbi->cur_reserved_blocks)
> +ÂÂÂÂÂÂÂÂÂÂÂ sbi->cur_reserved_blocks = t;
>
> I think it is only safe to decrease the value of cur_reserved_blocks, and leave increase operation to
> dec_valid_block(,node)_count, it is safe to increase cur_reserved_blocks there.

For initialization of reserved_blocks, cur_reserved_blocks will always be zero
due to this check, and will be updated to close to reserved_blocks after block
allocation and deallocation of user, IMO, it's not looks reasonable to user.

Anyway, it's due to how you define semantics of soft reservation, so what is your
understanding of it?

Thanks,

>>
>> Thanks,
>>
>>>> ÂÂÂÂÂÂÂÂÂÂ spin_unlock(&sbi->stat_lock);
>>>> ÂÂÂÂÂÂÂÂÂÂ return count;
>>>> ÂÂÂÂÂÂ }
>> .
>>
>