Re: [PATCH v2] ceph: fix bare ceph_decode_8 OOB in decode_lockers()

From: Viacheslav Dubeyko

Date: Thu May 28 2026 - 14:11:18 EST


On Thu, 2026-05-28 at 09:25 -0400, Pavitra Jha wrote:
> decode_lockers() in cls_lock_client.c contains a bare ceph_decode_8(p)
> call after the decode_locker() loop that has no preceding bounds check.
>
> If a malicious or compromised OSD sends a cls_lock_get_info_reply where
> num_lockers is crafted such that the decode_locker() loop advances p
> exactly to end (or if num_lockers=0 and p is already at end after
> ceph_start_decoding() accepts struct_len=0), the subsequent bare
> ceph_decode_8(p) reads one byte past the validated buffer boundary.
>
> The result is passed directly into *type, which is subsequently used as
> a lock type discriminator by callers. An OSD-controlled one-byte OOB
> read at this position gives an attacker influence over the lock type
> field with no further preconditions.
>
> The safe variant ceph_decode_8_safe() already exists and is used
> consistently throughout the codebase. This site is the only remaining
> bare ceph_decode_8() in the decode_lockers() post-loop path.
>
> The goto target is err_free_lockers (not err_inval) because *lockers is
> already allocated at this point and must be freed on any decode failure.
>
> v1 of this series fixed the bare ceph_decode_32() before kzalloc_objs()
> and added the err_inval label. This v2 addresses the second bare decode
> identified by Viacheslav Dubeyko's review.
>
> Regarding the -EINVAL choice (raised in review): -EINVAL is correct for
> the err_inval path. The failure is structural malformation of OSD-supplied
> data, not a memory shortage. -ENOMEM would misrepresent the failure class
> to callers and to stable@ backporters triaging error paths.
>
> Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
> deployment can trigger this against any kernel client that issues the
> lock.get_info class method (e.g. during RBD exclusive lock acquisition)
> without any further privileges beyond OSD session establishment.
>
> Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Pavitra Jha <jhapavitra98@xxxxxxxxx>
> ---
> v2: Replace bare *type = ceph_decode_8(p) with ceph_decode_8_safe(),
> goto err_free_lockers to correctly free *lockers on failure.
> Address Viacheslav Dubeyko's review question about this site and
> clarify -EINVAL rationale.
> ---
> net/ceph/cls_lock_client.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
> index 4f27b3d15..c9183a348 100644
> --- a/net/ceph/cls_lock_client.c
> +++ b/net/ceph/cls_lock_client.c
> @@ -314,7 +314,7 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
> goto err_free_lockers;
> }
>
> - *type = ceph_decode_8(p);
> + ceph_decode_8_safe(p, end, *type, err_free_lockers);
> s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
> if (IS_ERR(s)) {
> ret = PTR_ERR(s);

Is it correct patch? Because, initial patch contained this:

> - *num_lockers = ceph_decode_32(p);
> + ceph_decode_32_safe(p, end, *num_lockers, err_inval);
> *lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);

And I expected to see both modifications. I am slightly confused, frankly
speaking.

Thanks,
Slava.