Re: [PATCH v8 01/11] btrfs: add raid stripe tree definitions

From: Johannes Thumshirn
Date: Wed Sep 13 2023 - 02:02:16 EST


On 12.09.23 22:32, David Sterba wrote:
>> @@ -306,6 +306,16 @@ BTRFS_SETGET_FUNCS(timespec_nsec, struct btrfs_timespec, nsec, 32);
>> BTRFS_SETGET_STACK_FUNCS(stack_timespec_sec, struct btrfs_timespec, sec, 64);
>> BTRFS_SETGET_STACK_FUNCS(stack_timespec_nsec, struct btrfs_timespec, nsec, 32);
>>
>> +BTRFS_SETGET_FUNCS(stripe_extent_encoding, struct btrfs_stripe_extent, encoding, 8);
>
> What is encoding referring to?

At the moment (only) the RAID type. But in the future it can be expanded
to all kinds of encodings, like Reed-Solomon, Butterfly-Codes, etc...

>> static struct btrfs_lockdep_keyset {
>> u64 id; /* root objectid */
>> - /* Longest entry: btrfs-block-group-00 */
>> - char names[BTRFS_MAX_LEVEL][24];
>> + /* Longest entry: btrfs-raid-stripe-tree-00 */
>> + char names[BTRFS_MAX_LEVEL][25];
>
> Length of "btrfs-raid-stripe-tree-00" is 25, there should be +1 for the
> NUL, also length aligned to at least 4 is better.
>

OK.

>> struct lock_class_key keys[BTRFS_MAX_LEVEL];
>> } btrfs_lockdep_keysets[] = {
>> { .id = BTRFS_ROOT_TREE_OBJECTID, DEFINE_NAME("root") },
>> @@ -74,6 +74,7 @@ static struct btrfs_lockdep_keyset {
>> { .id = BTRFS_UUID_TREE_OBJECTID, DEFINE_NAME("uuid") },
>> { .id = BTRFS_FREE_SPACE_TREE_OBJECTID, DEFINE_NAME("free-space") },
>> { .id = BTRFS_BLOCK_GROUP_TREE_OBJECTID, DEFINE_NAME("block-group") },
>> + { .id = BTRFS_RAID_STRIPE_TREE_OBJECTID,DEFINE_NAME("raid-stripe-tree") },
>
> The naming is without the "tree"

OK

>> @@ -73,6 +72,9 @@
>> /* Holds the block group items for extent tree v2. */
>> #define BTRFS_BLOCK_GROUP_TREE_OBJECTID 11ULL
>>
>> +/* tracks RAID stripes in block groups. */
>
> Tracks ...
>

OK

>> +#define BTRFS_RAID_STRIPE_TREE_OBJECTID 12ULL
>> +
>> /* device stats in the device tree */
>> #define BTRFS_DEV_STATS_OBJECTID 0ULL
>>
>> @@ -285,6 +287,8 @@
>> */
>> #define BTRFS_QGROUP_RELATION_KEY 246
>>
>> +#define BTRFS_RAID_STRIPE_KEY 247
>
> Any particular reason you chose 247 for the key number? It does not
> leave any gap after BTRFS_QGROUP_RELATION_KEY and before
> BTRFS_BALANCE_ITEM_KEY. If this is related to extents then please find
> more suitable group of keys where to put it.

Nope, it was just the last free spot.

>
>> +
>> /*
>> * Obsolete name, see BTRFS_TEMPORARY_ITEM_KEY.
>> */
>> @@ -719,6 +723,31 @@ struct btrfs_free_space_header {
>> __le64 num_bitmaps;
>> } __attribute__ ((__packed__));
>>
>> +struct btrfs_raid_stride {
>> + /* btrfs device-id this raid extent lives on */
>
> Comments should be full sentences.

OK

>
>> + __le64 devid;
>> + /* physical location on disk */
>> + __le64 physical;
>> + /* length of stride on this disk */
>> + __le64 length;
>> +};
>
> __attribute__ ((__packed__));

The structure doesn't have any holes in it so packed is not needed.

I might also be misinformed, but doesn't packed potentially lead to bad
code generation on some platforms? I've always been under the
impression that packed forces the compiler to do byte-wise loads and
stores. But as I've said, I might be misinformed.

>
>> +
>> +#define BTRFS_STRIPE_DUP 0
>> +#define BTRFS_STRIPE_RAID0 1
>> +#define BTRFS_STRIPE_RAID1 2
>> +#define BTRFS_STRIPE_RAID1C3 3
>> +#define BTRFS_STRIPE_RAID1C4 4
>> +#define BTRFS_STRIPE_RAID5 5
>> +#define BTRFS_STRIPE_RAID6 6
>> +#define BTRFS_STRIPE_RAID10 7
>
> This is probably defining the on-disk format so some consistency is
> desired, there are already the BTRFS_BLOCK_GROUP_* types, from which the
> BTRFS_RAID_* are derive, so the BTRFS_STRIPE_* values should match the
> order and ideally the values themselves if possible.
>
>> +
>> +struct btrfs_stripe_extent {
>> + __u8 encoding;
>> + __u8 reserved[7];
>> + /* array of raid strides this stripe is composed of */
>> + __DECLARE_FLEX_ARRAY(struct btrfs_raid_stride, strides);
>
> Do we really whant to declare that as __DECLARE_FLEX_ARRAY? It's not a
> standard macro and obscures the definition.
>

Indeed we do not anymore, as this version does introduce another u64
before the strides array! I'll gladly get rid of it.