Re: [PATCH v3] ceph: fix two unsafe bare decodes in decode_lockers()

From: Viacheslav Dubeyko

Date: Tue Jun 02 2026 - 12:54:12 EST


On Tue, 2026-06-02 at 00:17 -0400, Pavitra Jha wrote:
> decode_lockers() in cls_lock_client.c contains two bare decode
> operations
> that allow a malicious or compromised OSD to trigger slab-out-of-
> bounds
> reads:
>
> 1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
>    check. ceph_start_decoding() accepts struct_len=0 as valid -- the
>    internal ceph_decode_need(p, end, 0, bad) always passes -- so when
> an
>    OSD sends struct_len=0, ceph_start_decoding() returns success with
>    p == end. The immediately following bare ceph_decode_32(p) then
> reads
>    4 bytes past the validated buffer boundary. The garbage value is
>    passed directly to kzalloc_objs() as the locker count.
>
>    The sibling function decode_watchers() in osd_client.c already
> uses
>    ceph_decode_32_safe() after its own ceph_start_decoding() call.
>    decode_lockers() was the only site using the bare variant.
>
> 2. ceph_decode_8(p) after the decode_locker() loop has no preceding
>    bounds check. If an OSD crafts num_lockers such that the loop
>    advances p exactly to end, the subsequent bare ceph_decode_8(p)
> reads
>    one byte past the validated buffer boundary. The result is passed
>    directly into *type, which is used as a lock type discriminator by
>    callers, giving an OSD-controlled one-byte OOB read with direct
>    influence over the lock type field.
>
> Fix both by replacing bare operations with their safe variants:
>   ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers,
>                                            err_inval)
>   ceph_decode_8(p)  -> ceph_decode_8_safe(p, end, *type,
>                                           err_free_lockers)
>
> The goto targets differ intentionally:
>   err_inval: is a new label returning -EINVAL directly. It is used
> for
>   the pre-allocation failure path where *lockers is not yet allocated
>   and must not be passed to ceph_free_lockers().
>
>   err_free_lockers: is the existing label. It is used for the
>   post-allocation failure path where *lockers is allocated and must
>   be freed.
>
> ret is set to -EINVAL before ceph_decode_8_safe() so that
> err_free_lockers returns the correct error code on bounds violation.
> Without this, err_free_lockers would return a stale ret value (0 from
> the successful decode_locker() loop), silently swallowing the error.
>
> -EINVAL is correct for both failure paths. The data received from the
> OSD is structurally malformed. -ENOMEM would misrepresent the failure
> class to callers and to stable@ backporters triaging error paths.
>
> KASAN report for bug 1 (kernel 7.0.0-rc7, QEMU/x86_64, KASLR
> disabled):
>   ==================================================================
>   BUG: KASAN: slab-out-of-bounds in ceph_oob3_init+0x251/0xff0
> [ceph_oob3_poc]
>   Read of size 4 at addr ffff88800a29b76e by task insmod/58
>
>   CPU: 0 UID: 0 PID: 58 Comm: insmod Tainted: G           O       
> 7.0.0-rc7-g9c2abf69da83-dirty #15 PREEMPT(lazy)
>   Tainted: [O]=OOT_MODULE
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-
> debian-1.17.0-1 04/01/2014
>   Call Trace:
>    <TASK>
>    dump_stack_lvl+0x4d/0x70
>    print_report+0x170/0x4f3
>    kasan_report+0xda/0x110
>    ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
>    do_one_initcall+0x9a/0x3a0
>    do_init_module+0x27c/0x790
>    load_module+0x4a9a/0x6350
>    init_module_from_file+0x15c/0x180
>    idempotent_init_module+0x21f/0x750
>    __x64_sys_finit_module+0xba/0x120
>    do_syscall_64+0xe2/0x570
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
>   Allocated by task 58:
>    kasan_save_stack+0x30/0x50
>    kasan_save_track+0x14/0x30
>    __kasan_kmalloc+0x7f/0x90
>    ceph_oob3_init+0x4d/0xff0 [ceph_oob3_poc]
>    do_one_initcall+0x9a/0x3a0
>    do_init_module+0x27c/0x790
>    load_module+0x4a9a/0x6350
>    init_module_from_file+0x15c/0x180
>    idempotent_init_module+0x21f/0x750
>    __x64_sys_finit_module+0xba/0x120
>    do_syscall_64+0xe2/0x570
>    entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
>   The buggy address belongs to the object at ffff88800a29a000
>    which belongs to the cache kmalloc-8k of size 8192
>   The buggy address is located 5998 bytes inside of
>    allocated 6000-byte region [ffff88800a29a000, ffff88800a29b770)
>
>   Memory state around the buggy address:
>    ffff88800a29b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>    ffff88800a29b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   >ffff88800a29b700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
>                                                                ^
>    ffff88800a29b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>   ==================================================================
>
>   num_lockers=0xccccaaaa (OOB garbage from KASAN redzone)
>
> Bug 2 (ceph_decode_8) follows from the identical precondition. A
> dedicated PoC is available on request.
>
> Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
> deployment can trigger this against any kernel client that issues the
> lock.get_info class method (e.g. during RBD exclusive lock
> acquisition)
> without any further privileges beyond OSD session establishment.
>
> Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Pavitra Jha <jhapavitra98@xxxxxxxxx>
> ---
> v3: Combine both fixes (ceph_decode_32 and ceph_decode_8) into a
> single
>     patch per Viacheslav Dubeyko's review. Set ret = -EINVAL before
>     ceph_decode_8_safe() so err_free_lockers returns the correct
> error
>     code, not stale ret (caught by Dan Carpenter / smatch). Clarify
>     err_inval vs err_free_lockers goto selection rationale and
>     -EINVAL justification.
> ---
>  net/ceph/cls_lock_client.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
> index c6956f1df..4e6a6d3e4 100644
> --- a/net/ceph/cls_lock_client.c
> +++ b/net/ceph/cls_lock_client.c
> @@ -299,7 +299,7 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
>   if (ret)
>   return ret;
>  
> - *num_lockers = ceph_decode_32(p);
> + ceph_decode_32_safe(p, end, *num_lockers, err_inval);
>   *lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);
>   if (!*lockers)
>   return -ENOMEM;
> @@ -310,7 +310,8 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
>   goto err_free_lockers;
>   }
>  
> - *type = ceph_decode_8(p);
> + ret = -EINVAL;
> + ceph_decode_8_safe(p, end, *type, err_free_lockers);
>   s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
>   if (IS_ERR(s)) {
>   ret = PTR_ERR(s);
> @@ -320,6 +321,8 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
>   *tag = s;
>   return 0;
>  
> +err_inval:
> + return -EINVAL;
>  err_free_lockers:
>   ceph_free_lockers(*lockers, *num_lockers);
>   return ret;

Looks good.

Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@xxxxxxx>

Thanks,
Slava.