Re: [PATCH RFC v2 03/18] cxl/mem: Read Dynamic capacity configuration from the device

From: Ira Weiny
Date: Sun Sep 03 2023 - 19:36:35 EST


Jonathan Cameron wrote:
> On Mon, 28 Aug 2023 22:20:54 -0700
> ira.weiny@xxxxxxxxx wrote:
>
> > From: Navneet Singh <navneet.singh@xxxxxxxxx>
> >
> > Devices can optionally support Dynamic Capacity (DC). These devices are
> > known as Dynamic Capacity Devices (DCD).
> >
> > Implement the DC (opcode 48XXh) mailbox commands as specified in CXL 3.0
> > section 8.2.9.8.9. Read the DC configuration and store the DC region
> > information in the device state.
> >
> > Co-developed-by: Navneet Singh <navneet.singh@xxxxxxxxx>
> > Signed-off-by: Navneet Singh <navneet.singh@xxxxxxxxx>
> > Signed-off-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> >
> Hi.
>
> A few minor things inline. Otherwise, I wonder if it's worth separating
> the mode of the region from that of the endpoint decoder in a precursor patch.
> That's a large part of this one and not really related to the mbox command stuff.

I've taken some time looking through my backup branches because I thought
this was a separate patch. I'm feeling like this was a rebase error where
some of the next patch got merged here accidentally. I agree it seems a
good idea to have it separate but I can't confirm at this point if it was
originally.

Split done.

[snip]

> > +
> > + rc = dc_resp->avail_region_count - start_region;
> > +
> > + /*
> > + * The number of regions in the payload may have been truncated due to
> > + * payload_size limits; if so adjust the count in this query.
>
> Not adjusting the query. "if so adjust the returned count to match."

Yep done!
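
For reference, the truncated-count arithmetic can be sketched in userspace as
below. This is only an illustration, not the patch code: the struct mirrors
the quoted cxl_dc_region_config layout (40 bytes per region after an 8 byte
header), while dc_regions_returned(), DC_CONFIG_HDR_SIZE and the guard
against a size_out smaller than the header are assumptions added for the
example.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the per-region entry in the quoted payload. */
struct dc_region_config {
	uint64_t region_base;
	uint64_t region_decode_length;
	uint64_t region_length;
	uint64_t region_block_size;
	uint32_t region_dsmad_handle;
	uint8_t flags;
	uint8_t rsvd[3];
} __attribute__((packed));

/* Fixed header: avail_region_count (1 byte) plus 7 reserved bytes. */
#define DC_CONFIG_HDR_SIZE 8

/*
 * Hypothetical helper mirroring CXL_REGIONS_RETURNED(): number of whole
 * region entries that fit in size_out bytes of output payload. The check
 * for size_out < header size is an addition here to avoid underflow.
 */
static inline int dc_regions_returned(size_t size_out)
{
	if (size_out < DC_CONFIG_HDR_SIZE)
		return 0;
	return (int)((size_out - DC_CONFIG_HDR_SIZE) /
		     sizeof(struct dc_region_config));
}
```

With this layout, a reply of 8 + 2 * 40 = 88 bytes carries two regions, and
any trailing partial entry is ignored by the integer division.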

>
> > + */
> > + if (mbox_cmd.size_out < sizeof(*dc_resp))
> > + rc = CXL_REGIONS_RETURNED(mbox_cmd.size_out);
> > +
> > + dev_dbg(dev, "Read %d/%d DC regions\n", rc, dc_resp->avail_region_count);
> > +
> > + return rc;
> > +}
> > +
> > +/**
> > + * cxl_dev_dynamic_capacity_identify() - Reads the dynamic capacity
> > + * information from the device.
> > + * @mds: The memory device state
> > + *
> > + * This will dispatch the get_dynamic_capacity command to the device
> > + * and on success populate structures to be exported to sysfs.
>
> I'd skip the 'exported to sysfs' as I'd guess this will have other uses
> (maybe) in the longer term.
>
> and on success populate state structures for later use.

Yea that was poorly worded. Changed to:

Read Dynamic Capacity information from the device and populate the
state structures for later use.

>
> > + *
> > + * Return: 0 if identify was executed successfully, -ERRNO on error.
> > + */
> > +int cxl_dev_dynamic_capacity_identify(struct cxl_memdev_state *mds)
> > +{
> > + struct cxl_mbox_dynamic_capacity *dc_resp;
> > + struct device *dev = mds->cxlds.dev;
> > + size_t dc_resp_size = mds->payload_size;
> > + u8 start_region;
> > + int i, rc = 0;
> > +
> > + for (i = 0; i < CXL_MAX_DC_REGION; i++)
> > + snprintf(mds->dc_region[i].name, CXL_DC_REGION_STRLEN, "<nil>");
> > +
> > + /* Check GET_DC_CONFIG is supported by device */
> > + if (!test_bit(CXL_DCD_ENABLED_GET_CONFIG, mds->dcd_cmds)) {
> > + dev_dbg(dev, "unsupported cmd: get_dynamic_capacity_config\n");
> > + return 0;
> > + }
> > +
> > + dc_resp = kvmalloc(dc_resp_size, GFP_KERNEL);
> > + if (!dc_resp)
> > + return -ENOMEM;
> > +
> > + start_region = 0;
> > + do {
> > + int j;
> > +
> > + rc = cxl_get_dc_id(mds, start_region, dc_resp, dc_resp_size);
>
> I'd spell out identify.
> Initially I thought this was getting an index.

Actually, this is getting the DC configuration, so I'm changing it to:

cxl_get_dc_config()
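
The paged query this helper serves (ask from start_region, accumulate the
count, repeat until the device's available count is reached) can be sketched
in userspace roughly as follows. Everything here is a stand-in: struct
fake_dev, fake_get_dc_config(), read_all_dc_regions() and MAX_DC_REGION are
hypothetical names for illustration, and the "no progress" check is an
assumption of the sketch rather than something taken from the patch.

```c
#include <stdint.h>

#define MAX_DC_REGION 8	/* assumed cap, standing in for CXL_MAX_DC_REGION */

/* Hypothetical fake device: total regions and how many fit in one reply. */
struct fake_dev {
	uint8_t avail_region_count;
	uint8_t regions_per_reply;
};

/*
 * Stand-in for the config query: returns how many regions this reply
 * carries starting at start_region, or a negative value on error.
 */
static int fake_get_dc_config(const struct fake_dev *dev, uint8_t start_region)
{
	int remaining;

	if (start_region > dev->avail_region_count)
		return -1;
	remaining = dev->avail_region_count - start_region;
	return remaining < dev->regions_per_reply ?
		remaining : dev->regions_per_reply;
}

/* Page through all regions; returns the number read or a negative error. */
static int read_all_dc_regions(const struct fake_dev *dev)
{
	int nr = 0;

	do {
		int rc = fake_get_dc_config(dev, (uint8_t)nr);

		if (rc < 0)
			return rc;
		if (rc == 0)
			return -1;	/* no progress: do not loop forever */
		nr += rc;
		if (nr > MAX_DC_REGION)
			return -1;	/* more regions than we can store */
	} while (nr < dev->avail_region_count);

	return nr;
}
```

For example, a fake device with 5 regions and room for 2 per reply is read
in three queries (2, 2, 1), and a device reporting more than MAX_DC_REGION
regions is rejected.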

>
>
> > + if (rc < 0)
> > + goto free_resp;
> > +
> > + mds->nr_dc_region += rc;
> > +
> > + if (mds->nr_dc_region < 1 || mds->nr_dc_region > CXL_MAX_DC_REGION) {
> > + dev_err(dev, "Invalid num of dynamic capacity regions %d\n",
> > + mds->nr_dc_region);
> > + rc = -EINVAL;
> > + goto free_resp;
> > + }
> > +
> > + for (i = start_region, j = 0; i < mds->nr_dc_region; i++, j++) {
> > + rc = cxl_dc_save_region_info(mds, i, &dc_resp->region[j]);
> > + if (rc)
> > + goto free_resp;
> > + }
> > +
> > + start_region = mds->nr_dc_region;
> > +
> > + } while (mds->nr_dc_region < dc_resp->avail_region_count);
> > +
> > + mds->dynamic_cap =
> > + mds->dc_region[mds->nr_dc_region - 1].base +
> > + mds->dc_region[mds->nr_dc_region - 1].decode_len -
> > + mds->dc_region[0].base;
> > + dev_dbg(dev, "Total dynamic capacity: %#llx\n", mds->dynamic_cap);
> > +
> > +free_resp:
> > + kfree(dc_resp);
>
> Maybe a first use for __free in cxl?
>
> See include/linux/cleanup.h
> Would enable returns rather than goto and label.
>

Good idea. Done.
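
As a rough userspace illustration of what __free buys here, the sketch below
uses the compiler's cleanup attribute, which is the mechanism
include/linux/cleanup.h builds on. The names auto_free, free_ptr(),
cleanup_calls and read_config() are made up for the example; in the kernel
the annotation would name the matching release function, and for a buffer
obtained from kvmalloc() that release function is kvfree().

```c
#include <stdlib.h>

/* Counts cleanup invocations so the behaviour can be observed. */
static int cleanup_calls;

/* Cleanup handler: receives a pointer to the annotated variable. */
static inline void free_ptr(void *p)
{
	cleanup_calls++;
	free(*(void **)p);
}

/* Hypothetical userspace analogue of the kernel's __free() annotation. */
#define auto_free __attribute__((cleanup(free_ptr)))

/* Returns 0 on success, -1 on failure; buf is freed on every return path. */
static int read_config(size_t payload_size, int fail_midway)
{
	unsigned char *buf auto_free = malloc(payload_size);

	if (!buf)
		return -1;
	if (fail_midway)
		return -1;	/* no goto/label: buf is released here too */

	buf[0] = 0;
	return 0;
}
```

Each return statement releases the buffer automatically, which is what
allows the free_resp label and the gotos to be dropped.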

>
>
> > + if (rc)
> > + dev_err(dev, "Failed to get DC info: %d\n", rc);
>
> I'd prefer to see more specific debug in the few paths that don't already
> print it above.

With the use of __free it kind of went that way anyway.

Done.

>
> > + return rc;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_dev_dynamic_capacity_identify, CXL);
> > +
> > static int add_dpa_res(struct device *dev, struct resource *parent,
> > struct resource *res, resource_size_t start,
> > resource_size_t size, const char *type)
> > @@ -1208,8 +1369,12 @@ int cxl_mem_create_range_info(struct cxl_memdev_state *mds)
> > {
> > struct cxl_dev_state *cxlds = &mds->cxlds;
> > struct device *dev = cxlds->dev;
> > + size_t untenanted_mem;
> > int rc;
> >
> > + untenanted_mem = mds->dc_region[0].base - mds->static_cap;
> > + mds->total_bytes = mds->static_cap + untenanted_mem + mds->dynamic_cap;
> > +
> > if (!cxlds->media_ready) {
> > cxlds->dpa_res = DEFINE_RES_MEM(0, 0);
> > cxlds->ram_res = DEFINE_RES_MEM(0, 0);
> > @@ -1217,8 +1382,16 @@ int cxl_mem_create_range_info(struct cxl_memdev_state *mds)
> > return 0;
> > }
> >
> > - cxlds->dpa_res =
> > - (struct resource)DEFINE_RES_MEM(0, mds->total_bytes);
> > + cxlds->dpa_res = (struct resource)DEFINE_RES_MEM(0, mds->total_bytes);
>
> Beat back that auto-formatter! Or just run it once and fix everything before
> doing anything new.

Will do.

[snip]

> >
> > @@ -2234,7 +2247,7 @@ static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int i
> > * devm_cxl_add_region - Adds a region to a decoder
> > * @cxlrd: root decoder
> > * @id: memregion id to create, or memregion_free() on failure
> > - * @mode: mode for the endpoint decoders of this region
> > + * @mode: mode of this region
> > * @type: select whether this is an expander or accelerator (type-2 or type-3)
> > *
> > * This is the second step of region initialization. Regions exist within an
> > @@ -2245,7 +2258,7 @@ static struct cxl_region *cxl_region_alloc(struct cxl_root_decoder *cxlrd, int i
> > */
> > static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
> > int id,
> > - enum cxl_decoder_mode mode,
> > + enum cxl_region_mode mode,
> > enum cxl_decoder_type type)
> > {
> > struct cxl_port *port = to_cxl_port(cxlrd->cxlsd.cxld.dev.parent);
> > @@ -2254,11 +2267,12 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
> > int rc;
> >
> > switch (mode) {
> > - case CXL_DECODER_RAM:
> > - case CXL_DECODER_PMEM:
> > + case CXL_REGION_RAM:
> > + case CXL_REGION_PMEM:
> > break;
> > default:
> > - dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode);
>
> Arguably should have been moved to the cxl_decoder_mode_name() in patch 1
> before being changed to cxl_region_mode_name() when the two are separated in this
> patch. You could just add a note to patch 1 to say 'other instances will be
> covered by refactors shortly'.

Ah well, I've already split that out and sent it. I was hoping little
things like that could land quickly and we could get to the larger patches
in this series. For now I'm going to leave it (but split out as part of
the region mode patch).

[snip]

> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index cd4a9ffdacc7..ed282dcd5cf5 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -374,6 +374,28 @@ static inline const char *cxl_decoder_mode_name(enum cxl_decoder_mode mode)
> > return "mixed";
> > }
> >
> > +enum cxl_region_mode {
> > + CXL_REGION_NONE,
> > + CXL_REGION_RAM,
> > + CXL_REGION_PMEM,
> > + CXL_REGION_MIXED,
> > + CXL_REGION_DEAD,
> > +};
>
> It feels to me like you could have yanked the introduction and use of cxl_region_mode
> out as a trivial precursor patch with a note saying the separation will be needed
> shortly and why it will be needed.

Yep done. Like I said I think I had this split out at some point ...
It's immaterial now.

[snip]

> >
> > +#define CXL_DC_REGION_STRLEN 7
> > +struct cxl_dc_region_info {
> > + u64 base;
> > + u64 decode_len;
> > + u64 len;
> > + u64 blk_size;
> > + u32 dsmad_handle;
> > + u8 flags;
> > + u8 name[CXL_DC_REGION_STRLEN];
> > +};
> > +
> > /**
> > * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
> > *
> > @@ -449,6 +464,8 @@ struct cxl_dev_state {
> > * @enabled_cmds: Hardware commands found enabled in CEL.
> > * @exclusive_cmds: Commands that are kernel-internal only
> > * @total_bytes: sum of all possible capacities
> > + * @static_cap: Sum of RAM and PMEM capacities
>
> Sum of static RAM and PMEM capacities
>
> Dynamic cap may well be RAM or PMEM!

Indeed! Done.

[snip]

> >
> > /*
> > @@ -741,9 +771,31 @@ struct cxl_mbox_set_partition_info {
> > __le64 volatile_capacity;
> > u8 flags;
> > } __packed;
> > -
>
> ?

I just missed it when self reviewing. Fixed.

>
> > #define CXL_SET_PARTITION_IMMEDIATE_FLAG BIT(0)
> >
> > +struct cxl_mbox_get_dc_config {
> > + u8 region_count;
> > + u8 start_region_index;
> > +} __packed;
> > +
> > +/* See CXL 3.0 Table 125 get dynamic capacity config Output Payload */
> > +struct cxl_mbox_dynamic_capacity {
>
> Can we rename to make it more clear which payload this is?

Sure.

>
> > + u8 avail_region_count;
> > + u8 rsvd[7];
> > + struct cxl_dc_region_config {
> > + __le64 region_base;
> > + __le64 region_decode_length;
> > + __le64 region_length;
> > + __le64 region_block_size;
> > + __le32 region_dsmad_handle;
> > + u8 flags;
> > + u8 rsvd[3];
> > + } __packed region[];
> > +} __packed;
> > +#define CXL_DYNAMIC_CAPACITY_SANITIZE_ON_RELEASE_FLAG BIT(0)
> > +#define CXL_REGIONS_RETURNED(size_out) \
> > + ((size_out - 8) / sizeof(struct cxl_dc_region_config))
> > +
> > /* Set Timestamp CXL 3.0 Spec 8.2.9.4.2 */
> > struct cxl_mbox_set_timestamp_in {
> > __le64 timestamp;
> > @@ -867,6 +919,7 @@ enum {
> > int cxl_internal_send_cmd(struct cxl_memdev_state *mds,
> > struct cxl_mbox_cmd *cmd);
> > int cxl_dev_state_identify(struct cxl_memdev_state *mds);
> > +int cxl_dev_dynamic_capacity_identify(struct cxl_memdev_state *mds);
> > int cxl_await_media_ready(struct cxl_dev_state *cxlds);
> > int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
> > int cxl_mem_create_range_info(struct cxl_memdev_state *mds);
>
> ta

ta?

Ira