Re: [PATCH 0/4 v3] cxl/core: Enable Region creation on x86 with Low Mem Hole

From: Dave Jiang
Date: Wed Apr 02 2025 - 11:31:18 EST




On 4/2/25 4:51 AM, Robert Richter wrote:
> Dave,
>
> thank you for your answer.
>
> On 28.03.25 14:10:00, Dave Jiang wrote:
>>
>>
>> On 3/28/25 2:02 AM, Robert Richter wrote:
>>> On 25.03.25 17:13:50, Fabio M. De Francesco wrote:
>
>>>> Interference? Do you mean that this series would make the driver fail on
>>>> other platforms?
>>>
>>> No, other platforms must deal with that specific code and constraints.
>>>
>>>>
>>>> Of course I don't want anything like that. I'm not clear about it...
>>>> Would you please describe how would this series interfere and what
>>>> would happen on other platforms?
>>>
>>> Other platforms should not care about platform-specifics of others. So
>>> again, use a platform check and only enable that code where necessary.
>>> And this requires a well defined interface to common code.
>>
>> Hi Robert,
>
>> Can you please share more on the background information and/or your
>> specific concerns on the possible memory holes in the other
>> platforms that need to be considered and not covered by Fabio's
>> code? Let's all get on the same page of what specifics we need to
>> consider to make this work. Preferably I want to avoid arch and
>> platform specific code in CXL if possible. Of course that may not
>> always be possible. Would like to see if we can avoid a bunch of #ifdef
>> X86/ARM/INTEL/AMD and do it more cleanly. But fully understanding the
>> situation first would be helpful to determine that. Thank you!
>
> We implement a "special" case in the main path. This adds unnecessary
> complexity to the code, makes it hard to maintain, change or even to
> understand in the future. It becomes more error-prone. Though it is
> limited to x86 arch, the code runs for all platforms. A reuse for
> other archs will enable it for all platforms of those archs too.
>
> This general approach to add "special" cases does not scale. We see
> this already with the "extended linear cache" and now the "low mem
> hole". While I am fine with all those special cases (AMD address
> translation is another), we need a proper way to enable and implement
> those by reducing complexity and with a good isolation. This makes
> future changes easier and reduces conflicts with other
> implementations.

I'm thinking more that if those special cases are detectable, rather than governed by a set of ambiguous rules, then we might address those quirks in a better way than #ifdefs. For "extended linear cache", detection comes via an HMAT spec change. So while only Intel implements this on a platform right now, other vendors can and may do so in the future. The LMH is more difficult as there is no standard way to enumerate it. Hopefully a set of clear rules will define this. It does look like Dan is trying to get the CXL spec to clearly define this, and discussion in the WG is coming in the next couple of weeks. The AMD translation can be detected by seeing if certain ACPI callback methods exist, right? Is anything more required to detect that the special translation needs to be applied? But I do agree with reducing complexity for maintenance and future implementations.
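
To illustrate what "detectable rather than #ifdef" could look like, here is a rough user-space sketch, not kernel code. All names (struct platform_info, struct quirk, probe_quirks() and the detect helpers) are hypothetical, and the two detection conditions are only assumptions standing in for the HMAT and ACPI method checks mentioned above:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what firmware enumeration reported. */
struct platform_info {
	bool hmat_reports_extended_linear_cache; /* assumed HMAT-derived flag */
	bool acpi_has_translation_method;        /* assumed ACPI method probe */
};

enum quirk_id {
	QUIRK_EXT_LINEAR_CACHE,
	QUIRK_ADDR_TRANSLATION,
};

/* One table entry per quirk: the quirk applies only if detect() says so. */
struct quirk {
	enum quirk_id id;
	bool (*detect)(const struct platform_info *info);
};

static bool detect_ext_linear_cache(const struct platform_info *info)
{
	return info->hmat_reports_extended_linear_cache;
}

static bool detect_addr_translation(const struct platform_info *info)
{
	return info->acpi_has_translation_method;
}

static const struct quirk quirks[] = {
	{ QUIRK_EXT_LINEAR_CACHE, detect_ext_linear_cache },
	{ QUIRK_ADDR_TRANSLATION, detect_addr_translation },
};

/* Return a bitmask of the quirks whose detect() callback fired. */
static unsigned int probe_quirks(const struct platform_info *info)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++)
		if (quirks[i].detect(info))
			mask |= 1u << quirks[i].id;

	return mask;
}
```

Common code would then test bits in the returned mask at runtime instead of being compiled per vendor. The LMH would only fit such a table once there is a clear rule to put into its detect() callback.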

>
> The change in this series does not do much: it just finds a CFMWS
> region that is unaligned to the EP decoder's range and then shrinks
> the used SPA range of the EP to match that region. That can be
> implemented in a very simple way if we introduce a spa_range parameter
> plus a custom port setup. The generalized part of my address
> translation series already implements this, so it can be reused here.
> To implement LMH support, only the following is needed then:
>
> * add a setup function with a platform check to add a custom
> callback,
>
> * the callback checks for the LMH range and adjusts the spa_range.
>
> The modified spa_range then matches the region range (no changes
> needed here). That's it.
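
To make the two steps above concrete, here is a rough user-space sketch, not kernel code and not taken from either series. Every name in it (struct port, spa_fixup, platform_has_lmh(), lmh_spa_fixup(), port_setup(), port_spa_range()) is made up for illustration, and the clamp rule (same start address, CFMWS window ends earlier) is only an assumption about what the LMH check might look like:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range, loosely modeled on the kernel's struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical hook that may adjust an endpoint's SPA range. */
typedef void (*spa_fixup_fn)(struct range *spa, const struct range *cfmws);

struct port {
	spa_fixup_fn spa_fixup; /* NULL on platforms without a quirk */
};

/* Stand-in platform check; a real one would look at CPU/firmware data. */
static bool platform_has_lmh(void)
{
	return true;
}

/*
 * Callback: if the CFMWS window starts where the decoder range starts
 * but ends earlier (assumed LMH signature), shrink the SPA range to it.
 */
static void lmh_spa_fixup(struct range *spa, const struct range *cfmws)
{
	if (cfmws->start == spa->start && cfmws->end < spa->end)
		spa->end = cfmws->end;
}

/* Setup function: install the callback only where the check passes. */
static void port_setup(struct port *port)
{
	port->spa_fixup = platform_has_lmh() ? lmh_spa_fixup : NULL;
}

/* Common code: apply the optional fixup, otherwise use the range as-is. */
static struct range port_spa_range(const struct port *port, struct range spa,
				   const struct range *cfmws)
{
	if (port->spa_fixup)
		port->spa_fixup(&spa, cfmws);
	return spa;
}
```

With a decoder range of 0x0-0xFFFFFFFF and a CFMWS window of 0x0-0xBFFFFFFF, port_spa_range() returns 0x0-0xBFFFFFFF. A decoder range that already matches its window passes through unchanged, so the common path needs no LMH-specific branches.
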
>
> I can help making this work.

Code is always welcome :)
>
> I hope that makes sense?

Yes. Appreciate you explaining. Thank you!
>
> Thanks,
>
> -Robert
>