Re: [PATCH] cxl/acpi: Verify CHBS length for CXL2.0
From: Jonathan Cameron
Date: Fri Apr 04 2025 - 09:55:30 EST
On Fri, 28 Mar 2025 04:15:13 +0000
"Zhijian Li (Fujitsu)" <lizhijian@xxxxxxxxxxx> wrote:
> On 27/03/2025 21:36, Dan Williams wrote:
> > Zhijian Li (Fujitsu) wrote:
> >>
> >>
> >> On 27/03/2025 11:44, Ira Weiny wrote:
> >>> Li Zhijian wrote:
> >>>> Per CXL Spec r3.1 Table 9-21, CXL1.1 and CXL2.0 each define their
> >>>> own CHBS length; verify it so that an invalid CHBS is rejected.
> >>>
> >>>
> >>> I think this looks fine. But did a platform have issues with this?
> >>
> >> Not really. I discovered it while reviewing the code against the
> >> CXL specification.
> >>
> >> Currently, this issue arises only when I inject an incorrect length
> >> in a QEMU environment. Our hardware does not experience this problem.
> >>
> >>
> >>> Does this need to be backported?
> >> I remain neutral :)
> >
> > What does the kernel do with this invalid CHBS from QEMU? I would be
> > happy to let whatever bad effect from injecting a corrupted CHBS just
> > happen because there are plenty of ways for QEMU to confuse the kernel
> > even if the table lengths are correct.
> >
> > Unless it has real impact I would rather not touch the kernel for every
> > possible way that QEMU can make a mistake.
>
>
>
> Thank you for the feedback.
>
> If your earlier comments were specifically about ***backporting*** this patch,
> I agree there might not be an urgent need for that.
>
> However, regarding the discussion on whether this patch should be accepted
> upstream, TBH, I believe it is necessary.
>
> 1. The **CXL Specification (r3.1, Table 9-21)** explicitly defines `length`
> requirements for CHBS in both CXL 1.1 and CXL 2.0 cases. Failing to
> validate this field against the spec risks misinterpretation of invalid
> configurations.
>
> 2. Section **2.13.8** of the *CXL Memory Device Software Guide (Rev 1.0)* [1]
> recommends verifying the CHBS length.
>
> While the immediate impact might be limited to edge cases (e.g., incorrect QEMU configurations),
> upstreaming this aligns the kernel with spec-mandated checks and improves
> robustness for future use cases.
>
> [1] https://cdrdv2-public.intel.com/643805/643805_CXL_Memory_Device_SW_Guide_Rev1_1.pdf
Just to check - are we talking about a hacked QEMU, or is there some
configuration of QEMU that can generate the wrong length?
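
For anyone following along, the check under discussion can be sketched roughly
as below. This is a standalone illustration, not the patch itself: the macro
and function names are made up for this example, and the values (version 0 with
an 8 KB RCRB for CXL 1.1, version 1 with a 64 KB component register block for
CXL 2.0) are my reading of Table 9-21 and should be checked against the spec.
Rejecting unknown versions is a choice made for the sketch only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values as read from CXL r3.1 Table 9-21 (assumption). */
#define CHBS_VERSION_CXL11 0u
#define CHBS_VERSION_CXL20 1u
#define CHBS_LENGTH_CXL11  0x2000ull   /* 8 KB RCRB */
#define CHBS_LENGTH_CXL20  0x10000ull  /* 64 KB component registers */

/* Return true only if the CHBS length matches what its version requires. */
static bool chbs_length_valid(uint32_t cxl_version, uint64_t length)
{
	switch (cxl_version) {
	case CHBS_VERSION_CXL11:
		return length == CHBS_LENGTH_CXL11;
	case CHBS_VERSION_CXL20:
		return length == CHBS_LENGTH_CXL20;
	default:
		/* Unknown version: treated as invalid in this sketch. */
		return false;
	}
}
```

With a check of this shape, a CXL 2.0 entry advertising a 0x2000 length would
be skipped rather than mapped.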
Jonathan
>
>
> >
> > I.e. if it was a widespread problem that affected multiple QEMU users by
> > default then maybe. Just your local test gone awry? Maybe not.