Re: [PATCH 2/2] f2fs: support {d,id,did,x}node checksum

From: Chao Yu
Date: Tue Feb 13 2018 - 02:34:05 EST


Hi Jaegeuk,

On 2018/2/10 10:52, Chao Yu wrote:
> On 2018/2/10 9:41, Jaegeuk Kim wrote:
>> On 02/01, Chao Yu wrote:
>>>
>>>
>>> On 2018/2/1 6:15, Jaegeuk Kim wrote:
>>>> On 01/31, Chao Yu wrote:
>>>>> On 2018/1/31 10:02, Jaegeuk Kim wrote:
>>>>>> What if we want to add more entries in addition to node_checksum? Do we have
>>>>>> to add a new feature flag at every time? How about adding a layout value instead
>>>>>
>>>>> Hmm.. for previous implementation, IMO, we'd better add a new feature flag at
>>>>> every time, otherwise, w/ extra_nsize only, in current image, we can know a
>>>>> valid range of extended area in node block, but we don't know which
>>>>> fields/features are valid/enabled or not.
>>>>>
>>>>> One more thing is that if we can add one feature flag for each field, we got one
>>>>> more chance to disable it dynamically.
>>>>>
>>>>>> of extra_nsize? For example, layout #1 means node_checksum with extra_nsize=X?
>>>>>>
>>>>>>
>>>>>> What does 1017 mean? We need to make this structure more flexibly for new
>>>>>
>>>>> Yes, using raw 1017 is not appropriate here.
>>>>>
>>>>>> entries. Like this?
>>>>>> union {
>>>>>> struct node_v1;
>>>>>> struct node_v2;
>>>>>> struct node_v3;
>>>>>> ...
>>>>>> struct direct_node dn;
>>>>>> struct indirect_node in;
>>>>>> };
>>>>>> };
>>>>>>
>>>>>> struct node_v1 {
>>>>>> __le32 data[DEF_ADDRS_PER_BLOCK - V1_NSIZE=1];
>>>>>> __le32 node_checksum;
>>>>>> }
>>>>>>
>>>>>> struct node_v2 {
>>>>>> __le32 data[DEF_ADDRS_PER_BLOCK - V2_NSIZE=500];
>>>>>
>>>>> Hmm.. If we only need to add one more 4 bytes field in struct node_v2, but
>>>>> V2_NSIZE is defined as fixed 500, there must be 492 bytes wasted.
>>>>>
>>>>> Or we can define V2_NSIZE as 8, but if there comes more and more extended
>>>>> fields, node version count can be a large number, it results in complicated
>>>>> version recognization and handling.
>>>>>
>>>>> One more question is how can we control which fields are valid or not in
>>>>> comp[Vx_NSIZE]?
>>>>>
>>>>>
>>>>> Anyway, what I'm thinking is maybe we can restructure layout of node block like
>>>>> the one used by f2fs_inode:
>>>>>
>>>>> struct f2fs_node {
>>>>> union {
>>>>> struct f2fs_inode i;
>>>>> union {
>>>>> struct {
>>>>> __le32 node_checksum;
>>>>> __le32 feature_field_1;
>>>>> __le32 feature_field_2;
>>>>> ....
>>>>> __le32 addr[];
>>>>>
>>>>> };
>>>>> struct direct_node dn;
>>>>> struct indirect_node in;
>>>>> };
>>>>> };
>>>>> struct node_footer footer;
>>>>> } __packed;
>>>>>
>>>>> Moving all extended fields to the head of f2fs_node, so we don't have to use
>>>>> macro to indicate actual size of addr.
>>>>
>>>> Thinking what'd be the best way. My concern is, once getting more entries, we
>>>
>>> OK, I think we need more discussion.. ;)
>>>
>>>> can't set each of features individually. Like the second entry should have
>>>
>>> Oh, that will be hard. If we have to avoid that, we have to tag in somewhere
>>> e.g. f2fs_inode::i_flags2 to indicate which new field in f2fs_node is valid, for
>>> example:
>>>
>>> #define F2FS_NODE_CHECKSUM 0x0001
>>> #define F2FS_NODE_FIELD1 0x0002
>>> #define F2FS_NODE_FIELD2 0x0004
>>>
>>> union {
>>> struct {
>>> __le32 node_checksum;
>>> __le32 field_1;
>>> __le32 field_2;
>>> ....
>>> __le32 addr[];
>>> };
>>> struct direct_node dn;
>>> struct indirect_node in;
>>> };
>>>
>>> f2fs_inode::i_flags2 = F2FS_NODE_CHECKSUM | F2FS_NODE_FIELD1
>>> indicates that f2fs_node::node_checksum and f2fs_node::field_1 are valid;
>>>
>>> f2fs_inode::i_flags2 = F2FS_NODE_FIELD1 | F2FS_NODE_FIELD2
>>> indicates that f2fs_node::field_1 and f2fs_node::field_2 are valid.
>>
>> So, that's why I thought we may need a sort of each formats.
>
> Hmm.. if we have two new added fields, there are (2 << 2) combinations
> of all formats, as:
>
> struct original {
> __le32 data[DEF_ADDRS_PER_BLOCK];
> }
>
> struct node_v1 {
> __le32 data[DEF_ADDRS_PER_BLOCK - V1_NSIZE=1];
> __le32 field_1;
> }
>
> struct node_v2 {
> __le32 data[DEF_ADDRS_PER_BLOCK - V2_NSIZE=1];
> __le32 field_2;
> }
>
> struct node_v2 {
> __le32 data[DEF_ADDRS_PER_BLOCK - V3_NSIZE=2];
> __le32 field_1;
> __le32 field_2;
> }
>
> If we add more new fields, the node version will increase sharply due
> to there is (n << 2) combination with n fields. Right? Any thoughts to
> reduce maintaining overhead on those node versions structures?

Do you have time to explain more about the design of multiple version structure
for node block, I'm still be confused about two things:
1. what will we do if we want to add one new field in node structure.
2. how can we recognize which fields are valid and which ones are invalid.

Thanks,

>
> Thanks,
>
>>
>>>
>>> Any thoughts?
>>>
>>> Thanks,
>>>
>>>> enabled node_checksum, which we may not want to do.
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>> __le32 comp[V2_NSIZE];
>>>>>> }
>>>>>> ...
>>>>>>
>>>>>>> + };
>>>>>>> + struct direct_node dn;
>>>>>>> + struct indirect_node in;
>>>>>>> + };
>>>>>>> };
>>>>>>> struct node_footer footer;
>>>>>>> } __packed;
>>>>>>> --
>>>>>>> 2.15.0.55.gc2ece9dc4de6
>>>>>>
>>>>>> .
>>>>>>
>
> .
>