Re: [PATCH RFC v2 03/18] cxl/mem: Read Dynamic capacity configuration from the device

From: Ira Weiny
Date: Mon Sep 11 2023 - 17:17:01 EST


Jørgen Hansen wrote:
> On 8/29/23 07:20, ira.weiny@xxxxxxxxx wrote:
> > From: Navneet Singh <navneet.singh@xxxxxxxxx>
> >

[snip]

> > /**
> > * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data
> > *
> > @@ -449,6 +464,8 @@ struct cxl_dev_state {
> > * @enabled_cmds: Hardware commands found enabled in CEL.
> > * @exclusive_cmds: Commands that are kernel-internal only
> > * @total_bytes: sum of all possible capacities
> > + * @static_cap: Sum of RAM and PMEM capacities
> > + * @dynamic_cap: Complete DPA range occupied by DC regions
> > * @volatile_only_bytes: hard volatile capacity
> > * @persistent_only_bytes: hard persistent capacity
> > * @partition_align_bytes: alignment size for partition-able capacity
> > @@ -456,6 +473,10 @@ struct cxl_dev_state {
> > * @active_persistent_bytes: sum of hard + soft persistent
> > * @next_volatile_bytes: volatile capacity change pending device reset
> > * @next_persistent_bytes: persistent capacity change pending device reset
> > + * @nr_dc_region: number of DC regions implemented in the memory device
> > + * @dc_region: array containing info about the DC regions
> > + * @dc_event_log_size: The number of events the device can store in the
> > + * Dynamic Capacity Event Log before it overflows
> > * @event: event log driver state
> > * @poison: poison driver state info
> > * @fw: firmware upload / activation state
> > @@ -473,7 +494,10 @@ struct cxl_memdev_state {
> > DECLARE_BITMAP(dcd_cmds, CXL_DCD_ENABLED_MAX);
> > DECLARE_BITMAP(enabled_cmds, CXL_MEM_COMMAND_ID_MAX);
> > DECLARE_BITMAP(exclusive_cmds, CXL_MEM_COMMAND_ID_MAX);
> > +
> > u64 total_bytes;
> > + u64 static_cap;
> > + u64 dynamic_cap;
> > u64 volatile_only_bytes;
> > u64 persistent_only_bytes;
> > u64 partition_align_bytes;
> > @@ -481,6 +505,11 @@ struct cxl_memdev_state {
> > u64 active_persistent_bytes;
> > u64 next_volatile_bytes;
> > u64 next_persistent_bytes;
> > +
> > + u8 nr_dc_region;
> > + struct cxl_dc_region_info dc_region[CXL_MAX_DC_REGION];
> > + size_t dc_event_log_size;
> > +
> > struct cxl_event_state event;
> > struct cxl_poison_state poison;
> > struct cxl_security_state security;
> > @@ -587,6 +616,7 @@ struct cxl_mbox_identify {
> > __le16 inject_poison_limit;
> > u8 poison_caps;
> > u8 qos_telemetry_caps;
> > + __le16 dc_event_log_size;
> > } __packed;
>
> Hi,
>
> To handle backwards compatibility with CXL 2.0 devices,
> cxl_dev_state_identify() needs to handle both the CXL 2.0 and 3.0
> versions of struct cxl_mbox_identify. The spec says that newer code can
> use the payload size to detect the different versions, so something like
> the following:

Software does not need to detect the different versions explicitly. The
spec states that either the payload size or a zero value can be used.

"... software written to the new definition can use the zero value
                                                    ^^^^^^^^^^^^^^
or the payload size to detect devices that do not support the new
field."

A log size of 0 is valid and indicates no DC support.

That said, the current code could read a non-zero log size from a device
that returns the shorter payload, because id is not correctly initialized.
So good catch.
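
To illustrate the zero-value approach, here is a minimal user-space sketch,
not the driver code. It assumes the output buffer is zeroed before the
command is sent and that a CXL 2.0 device writes only the first 0x43 bytes
(the minimum length from the diff quoted below). struct identify_payload,
fake_device_identify() and read_dc_event_log_size() are hypothetical names
made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct cxl_mbox_identify. */
struct identify_payload {
	uint8_t legacy[0x43];         /* fields defined before CXL 3.0 */
	uint8_t dc_event_log_size[2]; /* little-endian, CXL 3.0 addition */
};

/* Simulates a device writing only dev_len bytes of its response. */
static void fake_device_identify(struct identify_payload *id,
				 const uint8_t *resp, size_t dev_len)
{
	assert(dev_len <= sizeof(*id));
	memcpy(id, resp, dev_len);
}

/* Returns the log size, or 0 when the device did not supply the field. */
static uint16_t read_dc_event_log_size(const uint8_t *resp, size_t dev_len)
{
	struct identify_payload id;

	/* Zero first, so bytes the device never writes read back as 0. */
	memset(&id, 0, sizeof(id));
	fake_device_identify(&id, resp, dev_len);

	return (uint16_t)(id.dc_event_log_size[0] |
			  (id.dc_event_log_size[1] << 8));
}
```

With the buffer zeroed, a short response yields a log size of 0 without any
explicit length check. Without the memset() the trailing field would hold
whatever happened to be on the stack.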

However, dc_event_log_size is not used anywhere. For this reason alone I
almost removed it from the code. This complication gives me even more
reason to do so.

>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 9462c34aa1dc..0a6f038996aa 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -1356,6 +1356,7 @@ int cxl_dev_state_identify(struct cxl_memdev_state *mds)
> .opcode = CXL_MBOX_OP_IDENTIFY,
> .size_out = sizeof(id),
> .payload_out = &id,
> + .min_out = CXL_MBOX_IDENTIFY_MIN_LENGTH,
> };
> rc = cxl_internal_send_cmd(mds, &mbox_cmd);
> if (rc < 0)
> @@ -1379,7 +1380,8 @@ int cxl_dev_state_identify(struct cxl_memdev_state *mds)
> mds->poison.max_errors = min_t(u32, val,
> CXL_POISON_LIST_MAX);
> }
>
> - mds->dc_event_log_size = le16_to_cpu(id.dc_event_log_size);
> + if (mbox_cmd.size_out >= CXL_MBOX_IDENTIFY_CXL3_LENGTH)
> + mds->dc_event_log_size = le16_to_cpu(id.dc_event_log_size);
>
> return 0;
> }
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ae9dcb291c75..756e30db10d6 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -629,8 +629,11 @@ struct cxl_mbox_identify {
> __le16 inject_poison_limit;
> u8 poison_caps;
> u8 qos_telemetry_caps;
> + /* CXL 3.0 additions */
> __le16 dc_event_log_size;
> } __packed;
> +#define CXL_MBOX_IDENTIFY_MIN_LENGTH 0x43
> +#define CXL_MBOX_IDENTIFY_CXL3_LENGTH sizeof(struct cxl_mbox_identify)
>
> /*
> * Common Event Record Format
>
> ---
>
> Something similar needs to be handled for cxl_event_get_int_policy with
> the addition of dyncap_settings to cxl_event_interrupt_policy, that Fan
> Ni mentions.

Yes, this needs to be handled. I overlooked that entire part. I think it
had something to do with the fact that the 3.0 errata had not been
published when the first RFC was sent out, and this version just continued
with the broken code.

Thanks for pointing this out and thanks for the review!
Ira