Re: [PATCH] cxl/test: reject wrapped GET_LOG offsets

From: Richard Cheng

Date: Tue Jun 30 2026 - 05:42:06 EST


On Sun, Jun 28, 2026 at 11:45:29AM +0800, Samuel Moelius wrote:
> On Wed, Jun 10, 2026 at 2:01 PM Alison Schofield
> <alison.schofield@xxxxxxxxx> wrote:
> >
> > On Fri, Jun 05, 2026 at 02:20:31PM +0000, Samuel Moelius wrote:
> > > The CXL mock mailbox GET_LOG handler validates the requested CEL slice
> > > with `offset + length > sizeof(mock_cel)`. Both fields come from the
> > > userspace CXL_MEM_SEND_COMMAND payload and are 32-bit values, so an
> > > offset near U32_MAX can wrap the addition to a small value and pass the
> > > bounds check.
> > >
> > > The wrapped request then uses the original large offset as the source
> > > address for memcpy(), reading far outside the mock CEL array.
> > >
> > > Validate the offset first and compare the length against the remaining
> > > CEL size so the check cannot wrap.
> > >
> > > Assisted-by: Codex:gpt-5.5-cyber-preview
> > > Signed-off-by: Samuel Moelius <sam.moelius@xxxxxxxxxxxxxxx>
> >
> > Hi Samuel,
> >
> > I'd suggest keeping the commit log focused on the broken property and
> > how the fix restores it, rather than tracing the individual arithmetic
> > operations and later accesses, which are already evident from the code.
> >
> > The GET_LOG handler is intended to reject requests that describe a CEL
> > range extending beyond the available data. The current validation can
> > incorrectly accept some malformed requests because of arithmetic
> > wraparound, and the fix restores that property by validating the
> > requested range in a way that cannot overflow.
> >
> > The discussion of the subsequent memcpy() access leaves me wondering
> > what the observable effect actually is. Does this return bogus CEL
> > data, trigger KASAN, crash the test module, or something else? If there
> > is a demonstrated failure, please describe it. Otherwise, I think the
> > property being restored is the more important aspect to capture in the
> > commit log.
>
> A longer explanation appears below, but the bug can cause the kernel
> to panic with an out-of-bounds copy. So highlight that fact in the
> commit message?
>
> ---
>
> The bug was validated with a PoC that deliberately invalidates the CXL
> mailbox GET_LOG command to the mock CXL memory device.
>
> The vulnerable code checked the requested log slice like this:
>
> if (offset + length > sizeof(mock_cel))
> return -EINVAL;
>
> Both offset and length are 32-bit fields supplied by userspace. The PoC sets:
>
> in.offset = UINT32_MAX;
> in.length = 1;
>
> So the vulnerable expression wraps:
>
> UINT32_MAX + 1 == 0
>
> That makes the range check pass, because 0 > sizeof(mock_cel) is false.
>
> After that, the mock GET_LOG handler still uses the original huge offset:
>
> memcpy(cmd->payload_out, data + offset, length);
>
> So it tries to copy 1 byte from far past the mock CEL buffer. In QEMU
> pre-fix, that caused a kernel page fault in memcpy_orig, through
> cxl_mock_mbox_send, and then a panic because the test kernel used
> oops=panic.
>

Hi Samuel,

The substrction range check looks right to me, it also matches the existing mock
LSA checks.

I would suggest a small regression test if you have time for it.
Now in-mind only a pseudo-code version, maybe something like that following

"""
/* Use a cxl_test memdev and the valid CEL UUID/advertised CEL size. */
ASSERT_EQ(send_get_log(fd, cel_uuid, UINT32_MAX, 1, out, 1), -EINVAL);

/* Preserve valid boundaries. */
ASSERT_EQ(send_get_log(fd, cel_uuid, cel_size - 1, 1, out, 1), 0);
ASSERT_EQ(send_get_log(fd, cel_uuid, cel_size, 0, NULL, 0), 0);
"""

The first case captures the overflow regression, the latter 2 document the
intended inclusive end-of-buffer behavior.

What do you think ?

Best regards,
Richard Cheng.