Re: [RFC PATCH v3] ceph: prevent a client from exceeding the MDS maximum xattr size

From: Xiubo Li
Date: Thu Jun 02 2022 - 06:57:47 EST



On 6/2/22 6:28 PM, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 6/2/22 5:26 PM, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 6/2/22 12:29 AM, Luís Henriques wrote:
The MDS tries to enforce a limit on the total key/values in extended
attributes. However, this limit is enforced only if doing a synchronous
operation (MDS_OP_SETXATTR) -- if we're buffering the xattrs, the MDS
doesn't have a chance to enforce these limits.

This patch adds support for decoding the xattrs maximum size setting that is
distributed in the mdsmap. Then, when setting an xattr, the kernel client
will fall back to a synchronous operation if that maximum size is exceeded.

While there, fix a dout() that would trigger a printk warning:

[ 98.718078] ------------[ cut here ]------------
[ 98.719012] precision 65536 too large
[ 98.719039] WARNING: CPU: 1 PID: 3755 at lib/vsprintf.c:2703 vsnprintf+0x5e3/0x600
...

URL: https://tracker.ceph.com/issues/55725
Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
---
fs/ceph/mdsmap.c | 27 +++++++++++++++++++++++----
fs/ceph/xattr.c | 12 ++++++++----
include/linux/ceph/mdsmap.h | 1 +
3 files changed, 32 insertions(+), 8 deletions(-)

* Changes since v2

Well, a lot has changed since v2! Now the xattr max value setting is
obtained through the mdsmap, which needs to be decoded, and the feature
that was used in the previous revision was dropped. The drawback is that
the MDS is unable to know in advance if a client is aware of this xattr
max value.

* Changes since v1

Added support for new feature bit to get the MDS max_xattr_pairs_size
setting.

Also note that this patch relies on a patch that hasn't been merged yet
("ceph: use correct index when encoding client supported features"),
otherwise the new feature bit won't be correctly encoded.
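
For readers following along, the fallback described in the commit message can be sketched in plain userspace C. This is only an illustration, not the code in fs/ceph/xattr.c: xattr_pair_size() and xattr_needs_sync() are hypothetical names, and the size accounting is simplified to what the MDS_MAX_XATTR_SIZE comment in the patch describes (name, value, and 4+4 bytes for their lengths).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Default used by the patch when the mdsmap carries no limit: 64K. */
#define MDS_MAX_XATTR_SIZE (1 << 16)

/*
 * Hypothetical helper: bytes one xattr adds to the encoded blob,
 * i.e. 4-byte name length + name + 4-byte value length + value.
 */
static size_t xattr_pair_size(const char *name, size_t value_len)
{
	assert(name != NULL);
	return 4 + strlen(name) + 4 + value_len;
}

/*
 * Hypothetical helper: returns true when the update would push the
 * buffered xattrs past the MDS limit, so the client should send a
 * synchronous request and let the MDS enforce the limit itself.
 */
static bool xattr_needs_sync(size_t current_blob_size, const char *name,
			     size_t value_len, uint64_t max_xattr_size)
{
	size_t required = current_blob_size + xattr_pair_size(name, value_len);

	return required > max_xattr_size;
}
```

A caller would pass the limit decoded from the mdsmap, or MDS_MAX_XATTR_SIZE when the map is too old to carry one, and take the synchronous path whenever xattr_needs_sync() returns true.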

diff --git a/fs/ceph/mdsmap.c b/fs/ceph/mdsmap.c
index 30387733765d..36b2bc18ca2a 100644
--- a/fs/ceph/mdsmap.c
+++ b/fs/ceph/mdsmap.c
@@ -13,6 +13,12 @@
#include "super.h"
+/*
+ * Maximum size of xattrs the MDS can handle per inode by default. This
+ * includes the attribute name and 4+4 bytes for the key/value sizes.
+ */
+#define MDS_MAX_XATTR_SIZE (1<<16) /* 64K */
+
#define CEPH_MDS_IS_READY(i, ignore_laggy) \
(m->m_info[i].state > 0 && ignore_laggy ? true : !m->m_info[i].laggy)
@@ -352,12 +358,10 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end, bool msgr2)
__decode_and_drop_type(p, end, u8, bad_ext);
}
if (mdsmap_ev >= 8) {
- u32 name_len;
/* enabled */
ceph_decode_8_safe(p, end, m->m_enabled, bad_ext);
- ceph_decode_32_safe(p, end, name_len, bad_ext);
- ceph_decode_need(p, end, name_len, bad_ext);
- *p += name_len;
+ /* fs_name */
+ ceph_decode_skip_string(p, end, bad_ext);
}
/* damaged */
if (mdsmap_ev >= 9) {
@@ -370,6 +374,21 @@ struct ceph_mdsmap *ceph_mdsmap_decode(void **p, void *end, bool msgr2)
} else {
m->m_damaged = false;
}
+ if (mdsmap_ev >= 17) {
+ /* balancer */
+ ceph_decode_skip_string(p, end, bad_ext);
+ /* standby_count_wanted */
+ ceph_decode_skip_32(p, end, bad_ext);
+ /* old_max_mds */
+ ceph_decode_skip_32(p, end, bad_ext);
+ /* min_compat_client */
+ ceph_decode_skip_8(p, end, bad_ext);
This is incorrect.

If mdsmap_ev == 15 the min_compat_client will be a feature_bitset_t instead of
int8_t.
Hmm... can you point me at where that's done in the code? As usual, I'm
confused with that code and simply can't see that.

Also, if that happens only when mdsmap_ev == 15, then there's no problem
because that branch is only taken if it's >= 17.
Yeah, so you should skip 32 or 32+64 bits instead here, just like:

                /* version >= 3, feature bits */
                ceph_decode_32_safe(&p, end, len, bad);
                if (len) {
                        ceph_decode_64_safe(&p, end, features, bad);
                        p += len - sizeof(features);
                }

For the ceph code, please see
https://github.com/ceph/ceph/blob/main/src/mds/MDSMap.cc#L925.
I still don't see what you're saying. From what I understand, with <= 15 we
used to have 'min_compat_client', which is of type 'ceph_release_t',
defined in src/common/ceph_releases.h:

enum class ceph_release_t : std::uint8_t {
...
}

Okay, you are right.

I misread that code.

-- Xiubo


Then, starting with >= 16 the MDS ignores this 'min_compat_client' field
(but still encodes/decodes it), and it *adds* 'required_client_features',
which is a 'feature_bitset_t' and that is decoded immediately after (see
below, the ceph_decode_skip_set() call).
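
To make the field order concrete, here is a rough userspace sketch of skipping these fields in sequence. It is not the kernel decoder: struct cursor and the skip_*() helpers are hypothetical stand-ins for the ceph_decode_skip_*() macros, and the layout of 'required_client_features' is assumed to be a 32-bit byte length followed by that many bytes, as in the "feature bits" snippet quoted earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds-checked read position over an encoded buffer. */
struct cursor {
	const uint8_t *p;
	const uint8_t *end;
};

/* Advance by n bytes, failing instead of running past the end. */
static bool skip_bytes(struct cursor *c, size_t n)
{
	assert(c->p <= c->end);
	if ((size_t)(c->end - c->p) < n)
		return false;
	c->p += n;
	return true;
}

/* Read a little-endian 32-bit value. */
static bool read_le32(struct cursor *c, uint32_t *out)
{
	if ((size_t)(c->end - c->p) < 4)
		return false;
	*out = (uint32_t)c->p[0] | (uint32_t)c->p[1] << 8 |
	       (uint32_t)c->p[2] << 16 | (uint32_t)c->p[3] << 24;
	c->p += 4;
	return true;
}

/* Skip a 32-bit length followed by that many bytes. */
static bool skip_len_prefixed(struct cursor *c)
{
	uint32_t len;

	return read_le32(c, &len) && skip_bytes(c, len);
}

/* Skip the fields in the order discussed above. */
static bool skip_ev17_fields(struct cursor *c)
{
	return skip_len_prefixed(c) && /* balancer (string) */
	       skip_bytes(c, 4) &&     /* standby_count_wanted */
	       skip_bytes(c, 4) &&     /* old_max_mds */
	       skip_bytes(c, 1) &&     /* min_compat_client, one byte */
	       skip_len_prefixed(c);   /* required_client_features (assumed layout) */
}
```

The point of the sketch is only that 'min_compat_client' stays a single byte and the feature bitset is a separate field that follows it.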

Cheers,