Re: [PATCH v2 1/2] erofs: update on-disk format for xattr name filter

From: Alexander Larsson
Date: Wed Jul 05 2023 - 04:13:19 EST


On Wed, Jul 5, 2023 at 9:51 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2023/7/5 15:43, Alexander Larsson wrote:
> > On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >>
> >>
> >> On 2023/7/5 15:04, Jingbo Xu wrote:
> >>> The xattr name bloom filter feature is being introduced to speed up
> >>> negative xattr lookups, e.g. system.posix_acl_[access|default]
> >>> lookups when running an "ls -lR" workload.
> >>>
> >>> The number of commonly used extended attributes (n) is approximately 30.
> >>
> >> There are some commonly used extended attributes (n), and the total
> >> number of these is 30:
> >>
> >>>
> >>> trusted.overlay.opaque
> >>> trusted.overlay.redirect
> >>> trusted.overlay.origin
> >>> trusted.overlay.impure
> >>> trusted.overlay.nlink
> >>> trusted.overlay.upper
> >>> trusted.overlay.metacopy
> >>> trusted.overlay.protattr
> >>> user.overlay.opaque
> >>> user.overlay.redirect
> >>> user.overlay.origin
> >>> user.overlay.impure
> >>> user.overlay.nlink
> >>> user.overlay.upper
> >>> user.overlay.metacopy
> >>> user.overlay.protattr
> >>> security.evm
> >>> security.ima
> >>> security.selinux
> >>> security.SMACK64
> >>> security.SMACK64IPIN
> >>> security.SMACK64IPOUT
> >>> security.SMACK64EXEC
> >>> security.SMACK64TRANSMUTE
> >>> security.SMACK64MMAP
> >>> security.apparmor
> >>> security.capability
> >>> system.posix_acl_access
> >>> system.posix_acl_default
> >>> user.mime_type
> >>>
> >>> Given the number of bits of the bloom filter (m) is 32, the optimal
> >>> value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).
> >>>
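
The k = 1 choice quoted above follows from the usual Bloom filter rule of thumb, k = ln2 * m/n. A minimal sketch of that arithmetic, assuming the helper names below (they are hypothetical and not part of the patch):

```c
#include <assert.h>

/* ln(2), written out so the sketch does not need libm. */
#define LN2 0.6931471805599453

/* Theoretical optimum k = ln2 * m / n for an m-bit filter and n names. */
static double bloom_optimal_k(unsigned int m_bits, unsigned int n_names)
{
    assert(n_names > 0);
    return LN2 * (double)m_bits / (double)n_names;
}

/* Round to the nearest integer; a filter needs at least one hash function. */
static unsigned int bloom_hash_count(unsigned int m_bits, unsigned int n_names)
{
    unsigned int k = (unsigned int)(bloom_optimal_k(m_bits, n_names) + 0.5);

    return k ? k : 1;
}
```

With m = 32 and n = 30, bloom_optimal_k() gives about 0.74, which rounds to a single hash function.
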
> >>> The single hash function is implemented as:
> >>>
> >>> xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
> >>>
> >>> where index represents the index of corresponding predefined short name
> >>
> >> where `index`...
> >>
> >>
> >>
> >>> prefix, while name represents the name string after stripping the above
> >>> predefined name prefix.
> >>>
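
For illustration, the hash step described above could be sketched in userspace as follows. The xxh32() below is a plain reimplementation of the public XXH32 algorithm, not the kernel's helper. xattr_filter_bit() is a hypothetical name, and taking the low 5 bits of the hash as the bit position is an assumption; the quoted text only says the names are mapped into a 32-bit filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P1 0x9E3779B1U
#define P2 0x85EBCA77U
#define P3 0xC2B2AE3DU
#define P4 0x27D4EB2FU
#define P5 0x165667B1U

#define EROFS_XATTR_FILTER_SEED 0x25BBE08FU /* from the commit message */
#define XATTR_FILTER_BITS 32U               /* m in the commit message */

static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Read 4 bytes as a little-endian 32-bit value, independent of host order. */
static uint32_t read32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t xxh32_round(uint32_t acc, uint32_t input)
{
    acc += input * P2;
    acc = rotl32(acc, 13);
    return acc * P1;
}

static uint32_t xxh32(const void *data, size_t len, uint32_t seed)
{
    const uint8_t *p = data;
    const uint8_t *end = p + len;
    uint32_t h;

    if (len >= 16) {
        /* Four accumulators consume the input in 16-byte stripes. */
        uint32_t v1 = seed + P1 + P2, v2 = seed + P2;
        uint32_t v3 = seed, v4 = seed - P1;

        do {
            v1 = xxh32_round(v1, read32le(p));
            v2 = xxh32_round(v2, read32le(p + 4));
            v3 = xxh32_round(v3, read32le(p + 8));
            v4 = xxh32_round(v4, read32le(p + 12));
            p += 16;
        } while (end - p >= 16);
        h = rotl32(v1, 1) + rotl32(v2, 7) + rotl32(v3, 12) + rotl32(v4, 18);
    } else {
        h = seed + P5;
    }
    h += (uint32_t)len;

    while (end - p >= 4) {              /* remaining 4-byte words */
        h += read32le(p) * P3;
        h = rotl32(h, 17) * P4;
        p += 4;
    }
    while (p < end) {                   /* remaining tail bytes */
        h += (uint32_t)*p++ * P5;
        h = rotl32(h, 11) * P1;
    }

    h ^= h >> 15;                       /* final avalanche */
    h *= P2;
    h ^= h >> 13;
    h *= P3;
    h ^= h >> 16;
    return h;
}

/*
 * prefix_index: index of the predefined short name prefix.
 * name: the xattr name with that prefix already stripped.
 * ASSUMPTION: the bit position is the hash masked to the filter width.
 */
static unsigned int xattr_filter_bit(unsigned int prefix_index, const char *name)
{
    assert(name != NULL);
    return xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + prefix_index) &
           (XATTR_FILTER_BITS - 1);
}
```

Because the prefix index is folded into the seed, the same stripped name under two different prefixes normally hashes to different values, although they can still land on the same bit (as "metacopy" does in the table below).
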
> >>> The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is
> >>> used to give a better spread when mapping these 30 extended attributes
> >>> into the 32-bit bloom filter as follows:
> >>>
> >>> bit 0: security.ima
> >>> bit 1:
> >>> bit 2: trusted.overlay.nlink
> >>> bit 3:
> >>> bit 4: user.overlay.nlink
> >>> bit 5: trusted.overlay.upper
> >>> bit 6: user.overlay.origin
> >>> bit 7: trusted.overlay.protattr
> >>> bit 8: security.apparmor
> >>> bit 9: user.overlay.protattr
> >>> bit 10: user.overlay.opaque
> >>> bit 11: security.selinux
> >>> bit 12: security.SMACK64TRANSMUTE
> >>> bit 13: security.SMACK64
> >>> bit 14: security.SMACK64MMAP
> >>> bit 15: user.overlay.impure
> >>> bit 16: security.SMACK64IPIN
> >>> bit 17: trusted.overlay.redirect
> >>> bit 18: trusted.overlay.origin
> >>> bit 19: security.SMACK64IPOUT
> >>> bit 20: trusted.overlay.opaque
> >>> bit 21: system.posix_acl_default
> >>> bit 22:
> >>> bit 23: user.mime_type
> >>> bit 24: trusted.overlay.impure
> >>> bit 25: security.SMACK64EXEC
> >>> bit 26: user.overlay.redirect
> >>> bit 27: user.overlay.upper
> >>> bit 28: security.evm
> >>> bit 29: security.capability
> >>> bit 30: system.posix_acl_access
> >>> bit 31: trusted.overlay.metacopy, user.overlay.metacopy
> >>>
> >>> The h_name_filter field is introduced to the on-disk per-inode xattr
> >>> header to place the corresponding xattr name filter, where bit value 1
> >>> indicates non-existence for compatibility.
> >>>
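
The inverted bit sense described above can be sketched like this. The function names are hypothetical, and starting from an all-ones filter is an assumption about how an image builder might do it. Presumably the point of the inversion is that a zeroed field, as found in images written without the feature, never rules a name out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build an inverted filter: start with every bit set ("not present")
 * and clear the bit of each xattr name the inode actually carries.
 */
static uint32_t name_filter_build(const unsigned int *bits, unsigned int count)
{
    uint32_t filter = UINT32_MAX;
    unsigned int i;

    for (i = 0; i < count; i++) {
        assert(bits[i] < 32);
        filter &= ~(UINT32_C(1) << bits[i]);
    }
    return filter;
}

/*
 * true:  the name is definitely absent, so the lookup can be skipped.
 * false: the name may exist, so the regular xattr lookup is still needed.
 */
static bool name_filter_rules_out(uint32_t filter, unsigned int bit)
{
    assert(bit < 32);
    return (filter & (UINT32_C(1) << bit)) != 0;
}
```

For example, an inode carrying only security.selinux (bit 11) and security.capability (bit 29) would clear those two bits, so a lookup of system.posix_acl_access (bit 30) is ruled out without touching the xattr entries.
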
> >>> This feature is indicated by the EROFS_FEATURE_COMPAT_XATTR_FILTER
> >>> compatible feature bit.
> >>>
> >>> Suggested-by: Alexander Larsson <alexl@xxxxxxxxxx>
> >>> Signed-off-by: Jingbo Xu <jefflexu@xxxxxxxxxxxxxxxxx>
> >>> ---
> >>> fs/erofs/erofs_fs.h | 8 +++++++-
> >>> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> >>> index 2c7b16e340fe..b4b6235fd720 100644
> >>> --- a/fs/erofs/erofs_fs.h
> >>> +++ b/fs/erofs/erofs_fs.h
> >>> @@ -13,6 +13,7 @@
> >>>
> >>> #define EROFS_FEATURE_COMPAT_SB_CHKSUM 0x00000001
> >>> #define EROFS_FEATURE_COMPAT_MTIME 0x00000002
> >>> +#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004
> >>
> >> I'd suggest leaving one reserved byte in the superblock for now
> >> (and checking that it's 0), since
> >> 1) the xattr filter feature is a compatible feature;
> >> 2) I'm not sure whether the implementation might change.
> >>
> >> That way, later implementation changes won't need another compat
> >> bit.
> >
> > I would very much like to generate these bloom filters in composefs
> > right now, before the composefs v1 format is completely locked down,
> > and this should be fully possible given that this is a backwards
> > compat change. But this is only possible if it doesn't require a
> > feature flag like this that makes old erofs versions not mount the
> > image.
>
> EROFS has two types of feature bits:
>
> 1) compat flags, which don't block mounting on old kernels;
> 2) incompat flags, which will block mounting on old kernels.
>
> Here the bloom filter uses a new compat flag, so old kernels will just
> ignore it and mount. A compat flag just indicates "this image has a
> feature, and you may use it or not".
>
> Here I just meant that the bloom filter internals are fixed for now,
> so we might reserve a byte in the on-disk superblock for potential
> later changes (if any). Then we won't need to bother with another
> new compat flag.
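
The compat/incompat distinction described above boils down to a mount-time check along these lines. This is only a sketch: can_mount() and the KNOWN_* masks are hypothetical stand-ins for whatever a given kernel version understands, not the actual EROFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bits this (hypothetical, older) reader knows about. */
#define KNOWN_COMPAT   0x00000003U /* SB_CHKSUM | MTIME, as in the diff */
#define KNOWN_INCOMPAT 0x00000000U /* illustrative: none */

static bool can_mount(uint32_t feature_compat, uint32_t feature_incompat)
{
    /* Unknown compat bits are simply ignored: the image stays readable. */
    (void)feature_compat;

    /* Any unknown incompat bit means the layout cannot be trusted. */
    return (feature_incompat & ~KNOWN_INCOMPAT) == 0;
}
```

So an image that sets 0x00000004 (the new xattr filter bit) in the compat word still mounts on a reader that has never heard of it, while any unknown bit in the incompat word is refused.
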

Cool. Then we're all good!

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson Red Hat, Inc
alexl@xxxxxxxxxx alexander.larsson@xxxxxxxxx