[PATCH v3] ceph: fix two unsafe bare decodes in decode_lockers()
From: Pavitra Jha
Date: Tue Jun 02 2026 - 00:18:11 EST
decode_lockers() in cls_lock_client.c contains two bare decode operations
that allow a malicious or compromised OSD to trigger slab-out-of-bounds
reads:
1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
check. ceph_start_decoding() accepts struct_len=0 as valid -- the
internal ceph_decode_need(p, end, 0, bad) always passes -- so when an
OSD sends struct_len=0, ceph_start_decoding() returns success with
p == end. The immediately following bare ceph_decode_32(p) then reads
4 bytes past the validated buffer boundary. The garbage value is
passed directly to kzalloc_objs() as the locker count.
The sibling function decode_watchers() in osd_client.c already uses
ceph_decode_32_safe() after its own ceph_start_decoding() call.
decode_lockers() was the only site using the bare variant.
2. ceph_decode_8(p) after the decode_locker() loop has no preceding
bounds check. If an OSD crafts num_lockers such that the loop
advances p exactly to end, the subsequent bare ceph_decode_8(p) reads
one byte past the validated buffer boundary. The result is passed
directly into *type, which is used as a lock type discriminator by
callers, giving an OSD-controlled one-byte OOB read with direct
influence over the lock type field.
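The difference between a bare and a bounds-checked decode can be
sketched in userspace as follows. This is a minimal illustration, not
the kernel macros: demo_decode_32_safe() and demo_decode_check() are
hypothetical names, and the sketch assumes the little-endian wire
encoding used by the ceph_decode_*() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a bounds-checked 32-bit decode.
 * Returns 0 and advances *p on success; returns -1 and leaves *p and
 * *out untouched if fewer than 4 bytes remain before end.
 */
int demo_decode_32_safe(const uint8_t **p, const uint8_t *end, uint32_t *out)
{
	uint8_t b[4];

	if (*p > end || (size_t)(end - *p) < sizeof(b))
		return -1;	/* a bare decode would read past end here */

	memcpy(b, *p, sizeof(b));
	*out = (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
	*p += sizeof(b);
	return 0;
}

/* Exercises the p == end case described above and the normal case. */
void demo_decode_check(void)
{
	const uint8_t buf[4] = { 0x02, 0x00, 0x00, 0x00 };
	const uint8_t *end = buf + sizeof(buf);
	const uint8_t *p = end;		/* models struct_len=0: p == end */
	uint32_t v = 0;

	assert(demo_decode_32_safe(&p, end, &v) == -1);
	assert(p == end && v == 0);

	p = buf;
	assert(demo_decode_32_safe(&p, end, &v) == 0);
	assert(v == 2 && p == end);
}
```

With p == end the checked variant refuses to read, which is the
behaviour the bare calls at both sites lack.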
Fix both by replacing bare operations with their safe variants:
ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers,
err_inval)
ceph_decode_8(p) -> ceph_decode_8_safe(p, end, *type,
err_free_lockers)
The goto targets differ intentionally:
err_inval is a new label that returns -EINVAL directly. It is used for
the pre-allocation failure path, where *lockers is not yet allocated
and must not be passed to ceph_free_lockers().
err_free_lockers is the existing label. It is used for the
post-allocation failure path, where *lockers is allocated and must
be freed.
ret is set to -EINVAL before ceph_decode_8_safe() so that
err_free_lockers returns the correct error code on bounds violation.
Without this, err_free_lockers would return a stale ret value (0 from
the successful decode_locker() loop), silently swallowing the error.
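The two-label layout and the stale-ret hazard can be reproduced in a
small userspace sketch. Everything here is hypothetical: demo_parse()
and its [count][items][type] layout only mimic the control flow of
decode_lockers(), and the set_ret_before_type flag exists solely to
show both behaviours side by side.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser for [count:1 byte][count item bytes][type:1 byte].
 * Unlike decode_lockers(), it frees its array on success instead of
 * handing it to the caller, to keep the sketch leak-free.
 */
int demo_parse(const uint8_t *p, const uint8_t *end,
	       int set_ret_before_type, uint8_t *type)
{
	uint8_t *items;
	uint8_t count, i;
	int ret = 0;

	if (p >= end)
		goto err_inval;		/* nothing allocated yet */
	count = *p++;

	items = calloc(count ? count : 1, sizeof(*items));
	if (!items)
		return -ENOMEM;

	for (i = 0; i < count; i++) {
		if (p >= end) {
			ret = -EINVAL;
			goto err_free_items;
		}
		items[i] = *p++;
	}
	/* ret is still 0 here, left over from the successful loop */

	if (set_ret_before_type)
		ret = -EINVAL;
	if (p >= end)
		goto err_free_items;	/* returns whatever ret holds */
	*type = *p;

	free(items);
	return 0;

err_inval:
	return -EINVAL;
err_free_items:
	free(items);
	return ret;
}

/* Items end exactly at end, so the type byte is missing. */
void demo_parse_check(void)
{
	const uint8_t exact[] = { 2, 10, 20 };
	uint8_t type = 0;

	/* Without the assignment the failure is reported as success. */
	assert(demo_parse(exact, exact + sizeof(exact), 0, &type) == 0);
	/* With it, the caller sees -EINVAL. */
	assert(demo_parse(exact, exact + sizeof(exact), 1, &type) == -EINVAL);
}
```

The first call in demo_parse_check() returns 0 although no type byte
was decoded, which is the swallowed error that the ret = -EINVAL
assignment prevents.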
-EINVAL is correct for both failure paths because the data received
from the OSD is structurally malformed. -ENOMEM would misrepresent the
failure class to callers and to stable@ backporters triaging error paths.
KASAN report for bug 1, captured with an out-of-tree PoC module
(ceph_oob3_poc) on kernel 7.0.0-rc7, QEMU/x86_64, KASLR disabled:
==================================================================
BUG: KASAN: slab-out-of-bounds in ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
Read of size 4 at addr ffff88800a29b76e by task insmod/58
CPU: 0 UID: 0 PID: 58 Comm: insmod Tainted: G O 7.0.0-rc7-g9c2abf69da83-dirty #15 PREEMPT(lazy)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x70
print_report+0x170/0x4f3
kasan_report+0xda/0x110
ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
do_one_initcall+0x9a/0x3a0
do_init_module+0x27c/0x790
load_module+0x4a9a/0x6350
init_module_from_file+0x15c/0x180
idempotent_init_module+0x21f/0x750
__x64_sys_finit_module+0xba/0x120
do_syscall_64+0xe2/0x570
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Allocated by task 58:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
ceph_oob3_init+0x4d/0xff0 [ceph_oob3_poc]
do_one_initcall+0x9a/0x3a0
do_init_module+0x27c/0x790
load_module+0x4a9a/0x6350
init_module_from_file+0x15c/0x180
idempotent_init_module+0x21f/0x750
__x64_sys_finit_module+0xba/0x120
do_syscall_64+0xe2/0x570
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88800a29a000
which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 5998 bytes inside of
allocated 6000-byte region [ffff88800a29a000, ffff88800a29b770)
Memory state around the buggy address:
ffff88800a29b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88800a29b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88800a29b700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
^
ffff88800a29b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
num_lockers=0xccccaaaa (OOB garbage from KASAN redzone)
Bug 2 (ceph_decode_8) follows from the same precondition: p == end at
the point of the bare decode, here reached after the decode_locker()
loop. A dedicated PoC is available on request.
Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
deployment can trigger this against any kernel client that issues the
lock.get_info class method (e.g. during RBD exclusive lock acquisition)
without any further privileges beyond OSD session establishment.
Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Pavitra Jha <jhapavitra98@xxxxxxxxx>
---
v3: Combine both fixes (ceph_decode_32 and ceph_decode_8) into a single
patch per Viacheslav Dubeyko's review. Set ret = -EINVAL before
ceph_decode_8_safe() so err_free_lockers returns the correct error
code, not stale ret (caught by Dan Carpenter / smatch). Clarify
err_inval vs err_free_lockers goto selection rationale and
-EINVAL justification.
---
net/ceph/cls_lock_client.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
index c6956f1df..4e6a6d3e4 100644
--- a/net/ceph/cls_lock_client.c
+++ b/net/ceph/cls_lock_client.c
@@ -299,7 +299,7 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 	if (ret)
 		return ret;
 
-	*num_lockers = ceph_decode_32(p);
+	ceph_decode_32_safe(p, end, *num_lockers, err_inval);
 	*lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);
 	if (!*lockers)
 		return -ENOMEM;
@@ -310,7 +310,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 			goto err_free_lockers;
 	}
 
-	*type = ceph_decode_8(p);
+	ret = -EINVAL;
+	ceph_decode_8_safe(p, end, *type, err_free_lockers);
 	s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
 	if (IS_ERR(s)) {
 		ret = PTR_ERR(s);
@@ -320,6 +321,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 	*tag = s;
 	return 0;
 
+err_inval:
+	return -EINVAL;
 err_free_lockers:
 	ceph_free_lockers(*lockers, *num_lockers);
 	return ret;
--
2.53.0