RE: [PATCH v4] libceph, ceph: reject oversized mon and MDS data segments

From: Viacheslav Dubeyko

Date: Tue May 05 2026 - 14:52:21 EST


On Tue, 2026-05-05 at 12:24 +0400, Dhiraj Mishra wrote:


>
> Hi Slava,
>
> You are correct that the new check at the tail of mon_alloc_msg() is not
> for the get_generic_reply() cases.
>
> For CEPH_MSG_STATFS_REPLY and CEPH_MSG_MON_COMMAND_ACK we return early to
> get_generic_reply(), so those generic replies are covered by the separate
> data_len check there.
>
> The purpose of the mon_alloc_msg() check is the other monitor-reply path
> that stays in mon_alloc_msg(). In particular, the concrete trigger I
> tested is CEPH_MSG_MON_MAP: it does not use get_generic_reply(), it is
> allocated directly with ceph_msg_new(type, front_len, GFP_NOFS, false),
> and therefore can have msg->data_length == 0 while the wire header
> advertises a non-zero data_len.
>
> That is also why I kept the data_len validation after this existing
> logic:
>
>         if (!m) {
>                 ...
>         } else if (front_len > m->front_alloc_len) {
>                 ...
>                 ceph_msg_put(m);
>                 m = ceph_msg_new(type, front_len, GFP_NOFS, false);
>         }
>
> The front_len branch may replace m with a newly allocated message, so the
> data_len check needs to run on that final m as well. If it were folded
> into the same else-if chain, the replacement m would not be validated.
>
> So the split is intentional:
>
> - get_generic_reply():
>   generic replies backed by req->reply
> - mon_alloc_msg() tail check:
>   non-generic monitor replies allocated directly in mon_alloc_msg()
>   (including the tested CEPH_MSG_MON_MAP case)
>
> I agree this separation was not obvious enough from the previous diff.
> If you prefer, I can respin once more and move the validation into the
> common receive-side path so it is checked in one place.
>

I think that likewise code flow makes more sense:

if (!m) {
pr_info("alloc_msg unknown type %d\n", type);
*skip = 1;
return m;
}

if (front_len > m->front_alloc_len) {
pr_warn("mon_alloc_msg front %d > prealloc %d (%u#%llu)\n",
front_len, m->front_alloc_len,
(unsigned int)con->peer_name.type,
le64_to_cpu(con->peer_name.num));
ceph_msg_put(m);
m = ceph_msg_new(type, front_len, GFP_NOFS, false);
}

if (data_len > m->data_length) {
pr_warn_ratelimited("mon message data %u > prealloc %zu
(%u#%llu), skipping\n",
data_len, m->data_length,
(unsigned int)con->peer_name.type,
le64_to_cpu(con->peer_name.num));
ceph_msg_put(m);
m = NULL;
*skip = 1;
}

Is it possible that we could have both issues? Why do we behave differently in
both cases?

Thanks,
Slava.