Re: [PATCH v3] ceph: fix two unsafe bare decodes in decode_lockers()
From: Viacheslav Dubeyko
Date: Tue Jun 02 2026 - 12:54:12 EST
On Tue, 2026-06-02 at 00:17 -0400, Pavitra Jha wrote:
> decode_lockers() in cls_lock_client.c contains two bare decode
> operations that allow a malicious or compromised OSD to trigger
> slab-out-of-bounds reads:
>
> 1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
>    check. ceph_start_decoding() accepts struct_len=0 as valid -- the
>    internal ceph_decode_need(p, end, 0, bad) always passes -- so when an
>    OSD sends struct_len=0, ceph_start_decoding() returns success with
>    p == end. The immediately following bare ceph_decode_32(p) then reads
>    4 bytes past the validated buffer boundary. The garbage value is
>    passed directly to kzalloc_objs() as the locker count.
>
>    The sibling function decode_watchers() in osd_client.c already uses
>    ceph_decode_32_safe() after its own ceph_start_decoding() call.
>    decode_lockers() was the only site using the bare variant.
>
> 2. ceph_decode_8(p) after the decode_locker() loop has no preceding
>    bounds check. If an OSD crafts num_lockers such that the loop
>    advances p exactly to end, the subsequent bare ceph_decode_8(p) reads
>    one byte past the validated buffer boundary. The result is passed
>    directly into *type, which is used as a lock type discriminator by
>    callers, giving an OSD-controlled one-byte OOB read with direct
>    influence over the lock type field.
>
> Fix both by replacing bare operations with their safe variants:
>   ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers, err_inval)
>   ceph_decode_8(p)  -> ceph_decode_8_safe(p, end, *type, err_free_lockers)
>
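For anyone following the thread without the libceph decode helpers in
front of them, here is a minimal userspace sketch of the difference
between a bare and a bounds-checked decode. The name
sketch_decode_32_safe() is hypothetical and this is not the kernel
macro. It only mirrors the idea of checking the remaining length before
reading a little-endian 32-bit value, and it returns an error code where
the kernel macro jumps to a label.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a bounds-checked 32-bit decode.
 * Returns 0 and advances *p on success; returns -1 and leaves *p
 * untouched when fewer than 4 bytes remain before end.
 */
static int sketch_decode_32_safe(const uint8_t **p, const uint8_t *end,
				 uint32_t *out)
{
	const uint8_t *cur = *p;

	/* The check a bare decode skips: are 4 bytes really available? */
	if ((size_t)(end - cur) < sizeof(uint32_t))
		return -1;

	/* Assemble the value in little-endian byte order. */
	*out = (uint32_t)cur[0] |
	       ((uint32_t)cur[1] << 8) |
	       ((uint32_t)cur[2] << 16) |
	       ((uint32_t)cur[3] << 24);
	*p = cur + sizeof(uint32_t);
	return 0;
}
```

With p == end (the struct_len=0 case described above) the sketch
refuses to read, whereas a bare decode would dereference past the
buffer.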
> The goto targets differ intentionally:
> err_inval: is a new label returning -EINVAL directly. It is used for
> the pre-allocation failure path where *lockers is not yet allocated
> and must not be passed to ceph_free_lockers().
>
> err_free_lockers: is the existing label. It is used for the
> post-allocation failure path where *lockers is allocated and must
> be freed.
>
> ret is set to -EINVAL before ceph_decode_8_safe() so that
> err_free_lockers returns the correct error code on bounds violation.
> Without this, err_free_lockers would return a stale ret value (0 from
> the successful decode_locker() loop), silently swallowing the error.
>
> -EINVAL is correct for both failure paths. The data received from the
> OSD is structurally malformed. -ENOMEM would misrepresent the failure
> class to callers and to stable@ backporters triaging error paths.
>
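The two-label arrangement and the "set ret before the check" detail can
be illustrated with a small userspace sketch. Everything here
(struct sketch_reply, sketch_decode_reply(), the one-byte count and
one-byte items) is hypothetical and simplified; it is not the
decode_lockers() code, only the same error-path shape.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reply layout: count byte, count item bytes, type byte. */
struct sketch_reply {
	uint8_t *items;
	uint8_t count;
	uint8_t type;
};

static int sketch_decode_reply(const uint8_t *p, const uint8_t *end,
			       struct sketch_reply *r)
{
	int ret;

	r->items = NULL;

	/* Pre-allocation failure: nothing to free, so use err_inval. */
	if (p >= end)
		goto err_inval;
	r->count = *p++;

	r->items = calloc(r->count ? r->count : 1, 1);
	if (!r->items)
		return -ENOMEM;

	if ((size_t)(end - p) < r->count) {
		ret = -EINVAL;
		goto err_free;
	}
	memcpy(r->items, p, r->count);
	p += r->count;

	/* Set ret first so err_free does not return a stale value. */
	ret = -EINVAL;
	if (p >= end)
		goto err_free;
	r->type = *p;
	return 0;

err_inval:
	return -EINVAL;
err_free:
	free(r->items);
	r->items = NULL;
	return ret;
}
```

A truncated buffer that ends right after the items takes the err_free
path and reports -EINVAL, and an empty buffer takes err_inval without
touching the allocator.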
> KASAN report for bug 1 (kernel 7.0.0-rc7, QEMU/x86_64, KASLR disabled):
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
> Read of size 4 at addr ffff88800a29b76e by task insmod/58
>
> CPU: 0 UID: 0 PID: 58 Comm: insmod Tainted: G O 7.0.0-rc7-g9c2abf69da83-dirty #15 PREEMPT(lazy)
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x4d/0x70
> print_report+0x170/0x4f3
> kasan_report+0xda/0x110
> ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
> do_one_initcall+0x9a/0x3a0
> do_init_module+0x27c/0x790
> load_module+0x4a9a/0x6350
> init_module_from_file+0x15c/0x180
> idempotent_init_module+0x21f/0x750
> __x64_sys_finit_module+0xba/0x120
> do_syscall_64+0xe2/0x570
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Allocated by task 58:
> kasan_save_stack+0x30/0x50
> kasan_save_track+0x14/0x30
> __kasan_kmalloc+0x7f/0x90
> ceph_oob3_init+0x4d/0xff0 [ceph_oob3_poc]
> do_one_initcall+0x9a/0x3a0
> do_init_module+0x27c/0x790
> load_module+0x4a9a/0x6350
> init_module_from_file+0x15c/0x180
> idempotent_init_module+0x21f/0x750
> __x64_sys_finit_module+0xba/0x120
> do_syscall_64+0xe2/0x570
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> The buggy address belongs to the object at ffff88800a29a000
> which belongs to the cache kmalloc-8k of size 8192
> The buggy address is located 5998 bytes inside of
> allocated 6000-byte region [ffff88800a29a000, ffff88800a29b770)
>
> Memory state around the buggy address:
> ffff88800a29b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff88800a29b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >ffff88800a29b700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
> ^
> ffff88800a29b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
>
> num_lockers=0xccccaaaa (OOB garbage from KASAN redzone)
>
> Bug 2 (ceph_decode_8) follows from the identical precondition. A
> dedicated PoC is available on request.
>
> Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
> deployment can trigger this against any kernel client that issues the
> lock.get_info class method (e.g. during RBD exclusive lock acquisition)
> without any further privileges beyond OSD session establishment.
>
> Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Pavitra Jha <jhapavitra98@xxxxxxxxx>
> ---
> v3: Combine both fixes (ceph_decode_32 and ceph_decode_8) into a single
>     patch per Viacheslav Dubeyko's review. Set ret = -EINVAL before
>     ceph_decode_8_safe() so err_free_lockers returns the correct error
>     code, not stale ret (caught by Dan Carpenter / smatch). Clarify
>     err_inval vs err_free_lockers goto selection rationale and
>     -EINVAL justification.
> ---
> net/ceph/cls_lock_client.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
> index c6956f1df..4e6a6d3e4 100644
> --- a/net/ceph/cls_lock_client.c
> +++ b/net/ceph/cls_lock_client.c
> @@ -299,7 +299,7 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
>  	if (ret)
>  		return ret;
>
> -	*num_lockers = ceph_decode_32(p);
> +	ceph_decode_32_safe(p, end, *num_lockers, err_inval);
>  	*lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);
>  	if (!*lockers)
>  		return -ENOMEM;
> @@ -310,7 +310,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
>  			goto err_free_lockers;
>  	}
>
> -	*type = ceph_decode_8(p);
> +	ret = -EINVAL;
> +	ceph_decode_8_safe(p, end, *type, err_free_lockers);
>  	s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
>  	if (IS_ERR(s)) {
>  		ret = PTR_ERR(s);
> @@ -320,6 +321,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
>  	*tag = s;
>  	return 0;
>
> +err_inval:
> +	return -EINVAL;
>  err_free_lockers:
>  	ceph_free_lockers(*lockers, *num_lockers);
>  	return ret;
Looks good.
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>
Thanks,
Slava.