Re: [PATCH v8 02/21] cxl/mem: Read dynamic capacity configuration from the device

From: Ira Weiny
Date: Wed Jan 15 2025 - 15:48:57 EST


Alejandro Lucero Palau wrote:
>
> On 1/15/25 02:35, Dan Williams wrote:
> > Ira Weiny wrote:

[snip]

> >> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> >> index e8907c403edbd83c8a36b8d013c6bc3391207ee6..05a0718aea73b3b2a02c608bae198eac7c462523 100644
> >> --- a/drivers/cxl/cxlmem.h
> >> +++ b/drivers/cxl/cxlmem.h
> >> @@ -403,6 +403,7 @@ enum cxl_devtype {
> >> CXL_DEVTYPE_CLASSMEM,
> >> };
> >>
> >> +#define CXL_MAX_DC_REGION 8
> > Please no, lets not sign up to have the "which cxl 'region' concept are
> > you referring to?" debate in perpetuity. "DPA partition", "DPA
> > resource", "DPA capacity" anything but "region".
> >
> >
>
> This next comment is not my main point to discuss in this email
> (resources initialization is), but I seize it for giving my view in this
> one.
>
> Dan, you say later we (Linux) are not obligated to use "questionable
> naming decisions of specifications", but we should not confuse people
> either.
>
> Maybe CXL_MAX_DC_HW_REGION would help here, for differentiating it from
> the kernel software cxl region construct. I think we will need a CXL
> kernel dictionary sooner or later ...

I agree. I have had folks confused between spec and code and I'm really trying
to differentiate hardware region vs software partition.

>
> >> /**
> >> * struct cxl_dpa_perf - DPA performance property entry
> >> * @dpa_range: range for DPA address
> >> @@ -434,6 +435,8 @@ struct cxl_dpa_perf {
> >> * @dpa_res: Overall DPA resource tree for the device
> >> * @pmem_res: Active Persistent memory capacity configuration
> >> * @ram_res: Active Volatile memory capacity configuration
> >> + * @dc_res: Active Dynamic Capacity memory configuration for each possible
> >> + * region
> >> * @serial: PCIe Device Serial Number
> >> * @type: Generic Memory Class device or Vendor Specific Memory device
> >> * @cxl_mbox: CXL mailbox context
> >> @@ -449,11 +452,23 @@ struct cxl_dev_state {
> >> struct resource dpa_res;
> >> struct resource pmem_res;
> >> struct resource ram_res;
> >> + struct resource dc_res[CXL_MAX_DC_REGION];
> > This is throwing off cargo-cult alarms. The named pmem_res and ram_res
> > served us well up until the point where DPA partitions grew past 2 types
> > at well defined locations. I like the array of resources idea, but that
> > begs the question why not put all partition information into an array?
> >
> > This would also head off complications later on in this series where the
> > DPA capacity reservation and allocation flows have "dc" sidecars bolted
> > on rather than general semantics like "allocating from partition index N
> > means that all partitions indices less than N need to be skipped and
> > marked reserved".
>
>
> I guess this is likely how you want to change the type2 resource
> initialization issue and where I'm afraid these two patchsets are going
> to collide at.
>
> If that is the case, both are going to miss the next kernel cycle since
> it means major changes, but let's discuss it without further delays for
> the sake of implementing the accepted changes as soon as possible, and I
> guess with a close sync between Ira and I.
>
> BTW, in the case of the Type2, there are more things to discuss which I
> do there.

I'm looking at your set again because I think I missed this detail.

After looking into this more I think a single array of resources could be
done without too much major surgery.

The question for type 2 is what interface does the core export for
accelerators to request these resources? Or do we export a function like
add_dpa_res() and let drivers do that directly?

Dan is concerned about storing duplicate information about the partitions.
For DCD I think the code should call add_dpa_res() to create resources on
the fly as it detects partition information from the device. Type 2
drivers can call it however/whenever they want.
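
A rough userspace sketch of that idea, not the actual patch. Only the
add_dpa_res() name comes from the discussion above; its signature, the
struct layout, MAX_DPA_PARTS and dpa_skip_before() are all assumptions made
up for illustration, and plain structs stand in for struct resource. The
skip helper models the "partitions below index N get skipped" rule in the
simplest possible way by summing their sizes.

```c
#include <stdint.h>

/* Hypothetical cap for this sketch only; not a value from the patch or spec. */
#define MAX_DPA_PARTS 10

enum dpa_mode { DPA_RAM, DPA_PMEM, DPA_DC };

/* Stand-in for struct resource: one DPA partition. */
struct dpa_part {
	uint64_t start;
	uint64_t size;
	enum dpa_mode mode;
};

/* Stand-in for the device state: every partition lives in one array. */
struct dev_state {
	struct dpa_part part[MAX_DPA_PARTS];
	int nr_parts;
};

/*
 * Append one partition as it is discovered. Partitions must arrive in
 * increasing DPA order and must not overlap the previous one.
 * Returns the new partition index, or -1 on error (a kernel version
 * would return a proper errno instead).
 */
int add_dpa_res(struct dev_state *ds, uint64_t start, uint64_t size,
		enum dpa_mode mode)
{
	if (ds->nr_parts >= MAX_DPA_PARTS || size == 0)
		return -1;

	if (ds->nr_parts > 0) {
		const struct dpa_part *prev = &ds->part[ds->nr_parts - 1];

		if (start < prev->start + prev->size)
			return -1;
	}

	ds->part[ds->nr_parts] = (struct dpa_part){
		.start = start,
		.size = size,
		.mode = mode,
	};
	return ds->nr_parts++;
}

/*
 * Capacity that sits in partitions below idx, i.e. what would have to be
 * skipped before allocating from partition idx. Simplification: assumes
 * nothing has been allocated yet and ignores gaps between partitions.
 */
uint64_t dpa_skip_before(const struct dev_state *ds, int idx)
{
	uint64_t skip = 0;

	for (int i = 0; i < idx && i < ds->nr_parts; i++)
		skip += ds->part[i].size;
	return skip;
}
```

With this shape the DCD code would call add_dpa_res() once per partition
reported by the device, and a type 2 driver would make the same call with
whatever layout it knows about, so nothing is stored twice.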

We can even make this an xarray for complete flexibility with how many
partitions a device can have. Although I'm not sure if the spec allows
for that on type 2. Does it?

Ira

[snip]